Floral homeotic genes for manipulation of flowering in poplar and other plant species

ABSTRACT

Four floral homeotic genes from poplar are disclosed. The disclosed nucleic acid molecules are useful for producing transgenic plants having modified fertility characteristics, particularly sterility.

CROSS REFERENCE TO RELATED APPLICATION

[0001] This is a continuation-in-part of U.S. patent application Ser. No. 09/287,700 filed Apr. 6, 1999 which claims the benefit of U.S. Provisional Application No. 60/080,851, filed Apr. 6, 1998, both of which applications are incorporated herein by reference.

FIELD OF THE INVENTION

[0002] This invention relates to nucleic acid molecules isolated from Populus species, and methods of using these molecules and derivatives thereof to produce plants, particularly trees such as Populus species, that have modified fertility characteristics.

BACKGROUND OF THE INVENTION

[0003] The increasing demand for pulp and paper products and the diminishing availability of productive forest lands are being addressed in part by efforts to develop trees that produce increased yields in shorter growth periods. Many such efforts are focused on the production of transgenic trees having modified growth characteristics, such as reduced lignin content (see for example, U.S. Pat. No. 5,451,514, “Modification of Lignin Synthesis in Plants”), and resistance to insects, viruses and herbicides. A major concern with the production of transgenic trees is the possibility that the transgenic traits might be introduced into indigenous tree populations by cross-fertilization. Thus, for example, the introduction of genes for insect resistance into indigenous tree populations could accelerate the evolution of resistant insects, adversely affect endangered insect species and interfere with normal food chains. Because of these concerns, the U.S. and other governments have instituted regulatory review processes to assess the risks associated with proposed environmental releases of transgenic plants (both for field trials and commercial production).

[0004] Genetic engineering of sterility into trees offers the possibility of securing introduced genes in the engineered tree; trees that produce neither pollen nor seeds will not be able to transmit introduced genes by normal routes of reproduction. Additional potential benefits of engineering sterility into trees include increased wood yields and reduced production of allergens such as pollen. For a review of engineering reproductive sterility in forest trees, see Strauss et al. (1995a,b).

[0005] Two primary methods for engineering sterility have been described. In the first method, termed genetic ablation, a cytotoxic gene is expressed under the control of a reproductive tissue-specific promoter. Cytotoxic genes employed in this method to date include RNase (Mariani et al., 1990; Mariani et al., 1992; Reynaerts et al., 1993; Goldman et al., 1994), ADP-ribosyl transferase (Thorsness et al., 1991; Kandasamy, 1993; Thorsness et al., 1993), the Agrobacterium RolC gene (Schmülling, 1993), and glucanase (Worrall et al., 1992; Paul et al., 1992). The expression of the cytotoxic gene results (ideally) in the death of all cells in which the reproductive tissue-specific promoter is active. It is therefore critical that the promoter be highly specific to the reproductive tissue to avoid pleiotropic effects on vegetative tissue. For this reason, genome position effects on the transgene need to be monitored (see Strauss et al., 1995a,b). The success of genetic ablation methods in trees will thus depend on the availability of a suitable reproductive tissue-specific promoter for the tree species in question.

[0006] The second method for engineering sterility involves inhibiting the expression of genes that are essential for reproduction. This can be accomplished in a number of ways, including the use of antisense RNA, sense suppression and promoter-based suppression. Details and applications of antisense (Kooter, 1993; Mol et al., 1994; Van der Meer et al., 1992; Pnueli et al., 1994), sense suppression (Flavell, 1994; Jorgensen, 1992; Taylor et al., 1992) and promoter-based suppression (Brusslan et al., 1993; Matzke et al., 1993) technologies in plants have been described in the scientific literature. The key to the use of any of these methods in the production of sterile trees is the identification of appropriate indigenous genes, i.e., genes whose disrupted expression results in the abolition of correct reproductive tissue development.

[0007] Genes specifically expressed in reproductive tissues have been isolated from a number of plant species (for a review, see Strauss et al., 1995a). Genes that have been characterized as acting early in the development of floral structures include LEAFY (LFY) from Arabidopsis (Weigel et al., 1992), APETALA1 (AP1) from Arabidopsis (Mandel et al., 1992a,b), and FLORICAULA (FLO) from Antirrhinum (Coen et al., 1990), which regulate the transition from inflorescence to floral meristems. APETALA2 (AP2) appears to regulate the AGAMOUS gene (AG), which plays a role in differentiation of male and female floral tissues (see Okamuro et al., 1993). DEFICIENS (DEF) is a floral homeotic gene from Antirrhinum that is expressed throughout flower development (Schwarz-Sommer et al., 1992).

[0008] The majority of floral homeotic genes are members of the MADS-box family of transcription factors (Yanofsky et al., 1990). The MADS-box is a conserved region of approximately 60 amino acid residues. MADS is an acronym for the first four known genes in which the MADS-box was identified: yeast minichromosomal maintenance factor (MCM1), the floral homeotic genes AG and DEF, and human serum response factor (SRF). Plant MADS-box genes contain four domains: the highly conserved MADS-box region located near or at the 5′ end of the translated region in plant genes; the L or linker region between the MADS and K domains; the K domain, a moderately conserved keratin-like region predicted to form amphipathic α-helices; and a highly variable carboxy-terminal region. The K-box is only present in plant MADS-box genes. It is thought to be involved in protein-protein interactions (Pnueli et al., 1991).

[0009] Studies have shown that the organization of the MADS domain in plants is similar to that in SRF; the basic N-terminal portion of the domain is required for DNA-binding and the C-terminal half of the box is required for dimerization. Because MADS proteins bind DNA as dimers, the MADS box as well as a C-terminal extension that is involved in dimerization are required for DNA-binding. The C-terminal extension varies throughout the gene family. C-terminal deletions indicate that the minimal DNA-binding domain of AP1 and AG includes the MADS-box and part of the L region, whereas AP3 and PI require a portion of the K box in addition to the MADS and L regions (Riechmann et al., 1996). The difference in the sizes of the minimal binding domains is thought to reflect the dimerization characteristics of the respective proteins: AP1 and AG bind DNA as homodimers whereas AP3/PI and their Antirrhinum homologs DEF/GLO bind as heterodimers.

[0010] MADS-box proteins have been found to bind to a motif found in target gene promoters referred to as the CArG-box. CArG-box motifs are also found in the promoters of MADS-box genes, where they are thought to be targets for auto-regulation. Riechmann et al. (1996) used circular permutation and phasing analysis to detect conformational changes in DNA that resulted from MADS-box protein binding. They found that bound AP1, AP3/PI, and AG all induce DNA bending oriented toward the minor groove. For a review of MADS box biology, see Ma, 1994; Purugganan et al., 1995; and Yanofsky, 1995. AG and DEF have been characterized as MADS box genes; while FLO and LFY appear to encode transcription factors and have proline-rich and acidic domains, they are not MADS box genes.

[0011] Following a functional analysis of MADS box genes, Mizukami et al. (1996) created deletion mutants of AG in which various domains of the gene, including the MADS and K boxes, were deleted. Based on their results, they proposed that dominant negative mutations of MADS box genes could be created by deleting all or part of the MADS domain, by deleting all or part of the K domain, or by deleting various portions of the 3′ region of the AG open reading frame. It was proposed that the proteins encoded by these deletion mutants would be able to bind either the target DNA (i.e., the nucleotide sequence to which the transcription factor binds) or the protein co-factors required for transcription, but not both. Thus, it was proposed that such mutant proteins would interfere with the functioning of the coexisting corresponding endogenous gene. The studies of floral homeotic genes discussed in the preceding paragraphs have been primarily undertaken in model plants such as Arabidopsis and Antirrhinum; few, if any, studies have addressed the genetics of flowering in tree species at the molecular level.

[0012] Species of the genus Populus are becoming increasingly important in the forestry industry, particularly for pulp and paper production, in part because of their fast growth characteristics. This group includes aspens (species of Populus section Leuce and their hybrids), and hybrids between black cottonwood (P. trichocarpa Torr. and Gray, also classified as P. balsamifera subsp. trichocarpa; Brayshaw, 1965) and eastern cottonwood (P. deltoides L.). These species are also well suited to manipulation by genetic engineering because they are fast-growing, have relatively small genomes, are easy to regenerate in vitro, and are susceptible to transformation with Agrobacterium. To date, however, relatively few genes have been cloned from these species. Notably, the genetic basis underlying floral development in these species is almost completely uncharacterized.

[0013] Floral development in the genus Populus is significantly different from what is seen in a typical hermaphroditic annual (Nagaraj, 1952; Boes and Strauss, 1994). The apices of the branches do not become inflorescences. The flowers are borne on axillary inflorescences, or catkins, with male and female flowers found on separate trees, although occasionally mixed inflorescences or hermaphroditic flowers are seen. The inflorescences appear from dormant buds in the spring, usually occurring from about five years of age. Instead of the usual structure of four concentric whorls of organs (sepals outermost, followed by petals, then stamens surrounding one or more carpels in the center), the Populus flower apparently has only two whorls (a reduced perianth cup surrounding either stamens or carpels). Unlike several other species that produce unisexual flowers through developmental arrest or degeneration of one set of organs (Cheng et al., 1983; Grant et al., 1994), Populus does not initiate male organs in female flowers or vice versa (Boes and Strauss, 1994; Sheppard, 1997). After releasing pollen or seeds, the entire inflorescences are shed (Kaul, 1995). By late spring, the inflorescence buds for the next year's flowers have already been initiated in the axils of the current year's leaves, and will develop for several more months before going dormant.

[0014] The availability of genes that control floral development in Populus species would permit the production of genetically engineered sterile trees. In turn, the ability to control the fertility of Populus trees in this way would be of great value for the environmental safety and biosafety of Populus trees engineered for improved agronomic characteristics. It is to such genes that the present invention is directed.

SUMMARY OF THE INVENTION

[0015] The present invention provides four floral homeotic genes from Populus trichocarpa. The four genes are herein termed PTLF, PTD, PTAG-1 and PTAG-2. These genes are homologs of floral homeotic genes isolated from other plant species. Specifically, PTLF is a homolog of LEAFY (LFY) and FLORICAULA (FLO), PTD is a homolog of DEFICIENS (DEF), and PTAG-1 and PTAG-2 are homologs of AGAMOUS (AG). The Populus genes are shown to be expressed in floral tissues; for example, PTLF is expressed in immature inflorescences on which floral primordia are developing, whereas PTD is expressed strongly in stamen primordia from the onset of organogenesis. PTD is also expressed at low levels in carpel primordia.

[0016] The invention provides the nucleic acid sequences of these four Populus genes, the corresponding cDNA sequences and the deduced amino acid sequences of the encoded polypeptides. Along with these sequences, the present invention also provides methods of using the gene and cDNA sequences to produce genetically engineered Populus species and other trees having modified fertility characteristics, including sterility.

[0017] Genetic constructs useful in producing genetically engineered Populus and other trees include antisense versions of PTLF, PTD, PTAG-1 and PTAG-2, dominant negative mutants of these genes, and constructs useful for sense suppression. In addition, the promoter sequences of these genes may be used to obtain floral-specific expression of genes such as cytotoxins that may be employed in genetic ablation strategies to produce trees having modified fertility characteristics, including sterility.

[0018] In one aspect, the invention provides isolated nucleic acid molecules comprising portions of the disclosed nucleic acid sequences. Such molecules comprise at least 15 consecutive nucleotides of the disclosed PTLF, PTD, PTAG-1 or PTAG-2 nucleic acid sequences, and may be longer, comprising at least 20, 25, 50, or 100 consecutive nucleotides of these sequences. Such molecules are useful, among other things, as primers and probes for amplifying all or parts of the disclosed sequences and for detecting the expression of the nucleic acid molecules in cells, such as cells of transgenic plants. Thus, in one aspect, such molecules are useful to monitor the expression of transgenes comprising some portion of the PTD, PTLF, PTAG-1 or PTAG-2 molecules.

[0019] Modification of the fertility traits of plants, such as Populus species, may also be obtained by introducing genetic constructs containing variants of all or portions of the disclosed PTD, PTLF, PTAG-1 or PTAG-2 sequences. Such variants are provided by the invention and may comprise a nucleotide sequence of at least 50 (or, for example, at least 100) nucleotides in length, which sequence hybridizes under stringent conditions to the disclosed nucleic acid sequences. Alternatively, such variants may share a specified percentage of sequence identity with the disclosed nucleic acid sequences (e.g., at least 75% or at least 90% sequence identity) as determined using a specified sequence alignment program.

[0020] The disclosed nucleic acid molecules and variant forms of these molecules may be assembled in nucleic acid vectors for introduction into cells, such as plant cells. Thus, another aspect of the invention comprises the disclosed nucleic acid molecules and variants thereof, and vectors comprising these molecules.

[0021] In another embodiment, the invention provides transgenic plants comprising the vectors. Such transgenic plants may have altered phenotypes (compared to non-transgenic plants of the same species) including modified fertility characteristics. Modified fertility characteristics include modifications in the timing of flowering, for example, advancing the timing of flowering relative to non-transgenic plants of the same species, and sterility. Sterility may be complete sterility, or may be male only or female only sterility. Examples of transgenic plants provided by the present invention include genetically engineered sterile Populus and Eucalyptus species.

[0022] In another embodiment, the invention provides transgenic plants that comprise a recombinant expression cassette, wherein the recombinant expression cassette comprises a promoter sequence operably linked to a first nucleic acid sequence, and wherein the first nucleic acid sequence comprises all or part of one of the disclosed nucleic acid molecules, or a variant of one of the disclosed nucleic acid molecules. By way of example, such transgenic plants include plants in which the first nucleic acid is arranged in reverse orientation to the promoter sequence in the recombinant expression cassette, such that an antisense RNA is produced. In another example, such transgenic plants include plants in which the first nucleic acid is a dominant negative mutant of PTD, PTLF, PTAG-1 or PTAG-2, produced by deletion of part of the coding region, such as the 3′ portion of the open reading frame, or all or part of a MADS or K-box region of the coding region. In other embodiments, the promoter sequence driving expression of the first nucleic acid may be a promoter that confers enhanced expression of the first nucleic acid molecule in floral tissues of the plant relative to non-floral tissues.

[0023] In other embodiments, the expression of at least one endogenous gene in transgenic plants containing such a recombinant expression cassette will be modified as a result of the cassette. In particular embodiments, that modified expression will affect the fertility of the plant, and will render the plant sterile.

[0024] In yet other embodiments, the invention provides transgenic plants comprising a recombinant expression cassette, wherein the recombinant expression cassette comprises a promoter sequence operably linked to a first nucleic acid sequence, and wherein the promoter sequence is a promoter sequence from PTD, PTLF, PTAG-1 or PTAG-2. In particular embodiments, the first nucleic acid sequence encodes a cytotoxic polypeptide.

[0025] These and other aspects of the invention are described in more detail below.

Sequence Listing

[0026] The nucleic and amino acid sequences listed in the accompanying Sequence Listing are shown using standard letter abbreviations for nucleotide bases, and three-letter code for amino acids. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.

[0027] Seq. I.D. No. 1 shows the nucleic acid sequence of the PTD gene. The sequence comprises the following regions:

Nucleotide numbers   Feature
1-1872               5′ regulatory region
1752-1756            probable CAAT box
1782-1786            probable CAAT box
1845-1851            probable TATA box
1873-2188            Exon 1 (including inferred 5′ UTR)
2189-2327            Intron 1
2328-2394            Exon 2
2395-2484            Intron 2
2485-2546            Exon 3
2547-2652            Intron 3
2653-2752            Exon 4
2753-3309            Intron 4
3310-3351            Exon 5
3352-3432            Intron 5
3433-3477            Exon 6
3478-3584            Intron 6
3585-4000            Exon 7
3765-4285            3′ regulatory region (including 3′ UTR)
3765-4000            3′ UTR
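By way of illustration only, the exon and intron coordinates listed above for the PTD gene can be checked for contiguity and converted to feature lengths. The following is a minimal sketch; the names PTD_FEATURES, feature_length and is_contiguous are hypothetical and are not part of the Sequence Listing.

```python
# Exon and intron coordinates (1-based, inclusive) as listed for the PTD gene.
# The variable and function names are hypothetical, chosen for illustration.
PTD_FEATURES = [
    ("Exon 1", 1873, 2188),
    ("Intron 1", 2189, 2327),
    ("Exon 2", 2328, 2394),
    ("Intron 2", 2395, 2484),
    ("Exon 3", 2485, 2546),
    ("Intron 3", 2547, 2652),
    ("Exon 4", 2653, 2752),
    ("Intron 4", 2753, 3309),
    ("Exon 5", 3310, 3351),
    ("Intron 5", 3352, 3432),
    ("Exon 6", 3433, 3477),
    ("Intron 6", 3478, 3584),
    ("Exon 7", 3585, 4000),
]


def feature_length(start, end):
    """Length in nucleotides of a feature given inclusive 1-based coordinates."""
    return end - start + 1


def is_contiguous(features):
    """True if each feature starts one nucleotide after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(features, features[1:]))


exon_lengths = [
    feature_length(start, end)
    for name, start, end in PTD_FEATURES
    if name.startswith("Exon")
]
print(is_contiguous(PTD_FEATURES), exon_lengths)
```

The exon lengths computed this way include the untranslated regions that the listing assigns to Exon 1 and Exon 7.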

[0028] The sequence comprises the following regions:

Amino Acid numbers   Feature
1-57                 MADS domain
87-154               K-domain

[0029] Seq. I.D. No. 5 shows the nucleic acid sequence of the PTLF gene. The sequence comprises the following regions:

Nucleotide numbers   Feature
1-2638               5′ regulatory region
2477-2481            probable CAAT box
2536-2542            probable TATA box
2568-2574            probable TATA box
2628-3074            Exon 1
3075-3655            Intron 1
3656-3990            Exon 2
3991-4679            Intron 2
4680-5197            Exon 3
5043-5197            3′ UTR
5043-5656            3′ regulatory region (including 3′ UTR)

[0030] Seq. I.D. No. 9 shows the nucleic acid sequence of the PTAG-1 gene. The sequence comprises the following regions:

Nucleotide numbers   Feature
1-2410               5′ regulatory region
2411-2588            Exon 1
2589-3056            Intron 1
3057-3296            Exon 2
3297-8161            Intron 2
8162-8243            Exon 3
8244-8894            Intron 3
8895-8956            Exon 4
8957-9041            Intron 4
9042-9141            Exon 5
9142-9284            Intron 5
9285-9326            Exon 6
9327-9529            Intron 6
9530-9571            Exon 7
9572-9711            Intron 7
9712-9878            Exon 8
9879-10930           Intron 8
10931-11215          Exon 9
10935-11485          3′ regulatory region (including 3′ UTR)
10935-11215          3′ UTR

[0031] Polypeptide. The sequence comprises the following regions:

Amino Acid numbers   Feature
17-72                MADS domain
106-172              K-domain

[0032] Seq. I.D. No. 13 shows the nucleic acid sequence of the PTAG-2 gene. The sequence comprises the following regions:

Nucleotide numbers               Feature
1-2336                           5′ regulatory region
2118-2122                        probable CAAT box
2256-2262                        probable TATA box
2337-2421                        Exon 1
2422-2913                        Intron 1
2914-3153                        Exon 2
3154-7035                        Intron 2
7036-7117                        Exon 3
7118-7946                        Intron 3
7947-8008                        Exon 4
8009-8094                        Intron 4
8095-8194                        Exon 5
8195-8331                        Intron 5
8332-8373                        Exon 6
8374-8529                        Intron 6
8530-8571                        Exon 7
8572-8700                        Intron 7
8701-8863                        Exon 8
8864-9396                        Intron 8
9397-9691                        Exon 9
8863-10007                       3′ regulatory region (including 3′ UTR)
8863-8863 joined to 9397-9691    3′ UTR

[0033] Polypeptide. The sequence comprises the following regions:

Amino Acid numbers   Feature
16-72                MADS domain
106-172              K-domain

[0034] Amplify portions of the disclosed floral homeotic nucleic acid sequences.

DETAILED DESCRIPTION OF THE INVENTION

[0035] I. Definitions and Abbreviations

[0036] Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).

[0037] In order to facilitate review of the various embodiments of the invention, the following definitions of terms are provided:

[0038] Isolated: An “isolated” biological component (such as a nucleic acid or protein or organelle) has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, i.e., other chromosomal and extra-chromosomal DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been “isolated” include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.

[0039] cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments (introns). cDNA is synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.

[0040] Oligonucleotide: A linear polynucleotide sequence of up to about 100 nucleotide bases in length.

[0041] Operably linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.

[0042] ORF (open reading frame): A series of nucleotide triplets (codons) coding for amino acids without any termination codons. These sequences are usually translatable into a peptide.
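The definition above may be illustrated by a short sketch that scans the three forward reading frames of a DNA sequence for runs of codons that begin with ATG and continue until the first termination codon. The function name find_orfs and the toy sequence are hypothetical; requiring an ATG start is an assumption made for the example and is not part of the definition.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (start, end) pairs for ATG-initiated ORFs on the forward strand.

    Coordinates are 0-based and half-open; the termination codon is excluded.
    ORFs that run off the end of the sequence without a stop are not reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i))
                start = None
    return orfs


toy = "CCATGGCTAAATGACC"  # hypothetical sequence, not from the Sequence Listing
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```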

[0043] Ortholog: Two nucleotide or amino acid sequences are orthologs of each other if they share a common ancestral sequence and diverged when a species carrying that ancestral sequence split into two species. Orthologous sequences are also homologous sequences.

[0044] Probes and primers: Molecules useful as nucleic acid probes and primers may readily be prepared based on the nucleic acids provided by this invention. Typically, but not necessarily, such molecules are oligonucleotides, i.e., linear nucleic acid molecules of up to about 100 nucleotide bases in length. However, longer nucleic acid molecules, up to and including the full length of a particular floral homeotic gene, may also be employed for such purposes.

[0045] A nucleic acid probe comprises at least one copy (and typically many copies) of an isolated nucleic acid molecule of known sequence that is used in a nucleic acid hybridization protocol. Generally (but not always) the nucleic acid molecule is attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed, e.g., in Sambrook et al. (1989) and Ausubel et al. (1987).

[0046] Primers are short nucleic acids, usually DNA oligonucleotides 8-10 nucleotides or more in length, and more typically 15-25 nucleotides in length. Primers may be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods known in the art.
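Because a primer anneals to a complementary strand, the reverse primer of a pair is the reverse complement of the 3′ end of the displayed strand. The following is a minimal sketch of deriving a primer pair from the two ends of a template; the function names and the toy template are hypothetical, and no melting-temperature or secondary-structure checks of the kind performed by primer design programs are attempted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(template, length=20):
    """Return (forward, reverse) primers taken from the two ends of a template.

    The forward primer is identical to the first `length` nucleotides of the
    displayed strand; the reverse primer is the reverse complement of the last
    `length` nucleotides, so that it anneals to the displayed strand.
    """
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


toy_template = "ATGCCGTTAGGA"  # hypothetical, not from the Sequence Listing
print(end_primers(toy_template, length=4))
```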

[0047] Methods for preparing and using probes and primers are described, for example, in Sambrook et al. (1989), Ausubel et al. (1987), and Innis et al., (1990). PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, ©1991, Whitehead Institute for Biomedical Research, Cambridge, Mass.). One of skill in the art will appreciate that the specificity of a particular probe or primer increases with its length. Thus, for example, a primer comprising 20 consecutive nucleotides of the cDNA disclosed in Seq. I.D. No. 2 will anneal to a target sequence such as a homologous sequence in Eucalyptus contained within a Eucalyptus cDNA library with a higher specificity than a corresponding primer of only 15 nucleotides. Thus, in order to obtain greater specificity, probes and primers may be selected that comprise 20, 25, 30, 35, 40, 50, 75, 100 or more consecutive nucleotides of the disclosed nucleic acid sequences.
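The relationship between primer length and specificity can be illustrated with a simple calculation. Assuming, purely for illustration, a target of random sequence in which the four bases are equally frequent, the expected number of exact chance matches to a primer of n nucleotides on one strand is approximately the target size multiplied by (1/4)^n. The target size used below is a hypothetical figure, and real genomes depart from this random model.

```python
def expected_random_hits(primer_length, target_size_bp):
    """Expected exact chance matches of one primer on one strand.

    Assumes a random target sequence with equal base frequencies, which is a
    simplifying assumption and not a property of any particular genome.
    """
    return target_size_bp * 0.25 ** primer_length


TARGET_BP = 500_000_000  # hypothetical target size, for illustration only

for n in (15, 20, 25):
    print(n, expected_random_hits(n, TARGET_BP))
```

Under these assumptions each additional nucleotide reduces the expected number of chance matches fourfold, which is consistent with the preference for primers of 20 or more nucleotides stated above.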

[0048] The invention thus includes isolated nucleic acid molecules that comprise specified lengths of the disclosed floral homeotic sequences. Such molecules may comprise at least 8-10, 15, 20, 25, 30, 35, 40, 50, 75, or 100 consecutive nucleotides of these sequences and may be obtained from any region of the disclosed sequences. By way of example, the floral homeotic genes shown in the Sequence Listing may be apportioned into halves or quarters based on sequence length, and the isolated nucleic acid molecules may be derived from the first or second halves of the molecules, or any of the four quarters. The PTD cDNA, shown in Seq. I.D. No. 2 may be used to illustrate this. This cDNA is 924 nucleotides in length and so may be hypothetically divided into halves (nucleotides 1-462 and 463-924) or quarters (nucleotides 1-231, 232-462, 463-693 and 694-924). Nucleic acid molecules may be selected that comprise at least 8-10, 15, 20, 25, 30, 35, 40, 50, 75 or 100 consecutive nucleotides of any of these portions of the floral homeotic genes. Thus, one such nucleic acid molecule might comprise at least 25 consecutive nucleotides of the region comprising nucleotides 1-924 of the disclosed floral homeotic genes.
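The apportioning of a sequence into halves or quarters described above can be sketched as follows, using the 924-nucleotide length stated for the PTD cDNA. The function names partition and count_fragments are hypothetical.

```python
def partition(length, parts):
    """Split positions 1..length into `parts` consecutive inclusive ranges.

    Any remainder from uneven division is assigned to the last range.
    """
    size = length // parts
    bounds = []
    for k in range(parts):
        start = k * size + 1
        end = length if k == parts - 1 else (k + 1) * size
        bounds.append((start, end))
    return bounds


def count_fragments(start, end, fragment_length):
    """Number of distinct runs of consecutive nucleotides within start..end."""
    return max(0, (end - start + 1) - fragment_length + 1)


print(partition(924, 2))
print(partition(924, 4))
print(count_fragments(1, 231, 25))
```

For example, the first quarter (nucleotides 1-231) contains 207 distinct positions from which a fragment of 25 consecutive nucleotides may be taken.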

[0049] Purified: The term purified does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified PTAG-1 protein preparation is one in which the PTAG-1 protein is more pure than the protein in its natural environment within a cell. Generally, a preparation of a floral homeotic protein is purified such that the floral homeotic protein represents at least 5% of the total protein content of the preparation. For particular applications, higher purity may be desired, such that preparations in which the floral homeotic protein represents at least 50% or at least 75% of the total protein content may be employed.

[0050] Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.

[0051] Transformed: A transformed cell is a cell into which has been introduced a nucleic acid molecule by molecular biology techniques. As used herein, the term transformation encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including Agrobacterium-mediated transformation, transfection with viral vectors, transformation with plasmid vectors and introduction of naked DNA by electroporation, lipofection, and particle gun acceleration.

[0052] Transgenic plant: As used herein, this term refers to a plant that contains recombinant genetic material not normally found in plants of this type and which has been introduced into the plant in question (or into progenitors of the plant) by human manipulation. Thus, a plant that is grown from a plant cell into which recombinant DNA is introduced by transformation is a transgenic plant, as are all offspring of that plant that contain the introduced transgene (whether produced sexually or asexually).

[0053] Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a transformed host cell. A vector may include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. A vector may also include one or more selectable marker genes and other genetic elements known in the art.

[0054] Sequence identity: The relatedness of two nucleic acid sequences, or two amino acid sequences, is typically expressed in terms of the identity between the sequences (in the case of amino acid sequences, similarity is an alternative assessment). Sequence identity is frequently measured in terms of percentage identity; the higher the percentage, the more similar the two sequences are. Homologs of a disclosed floral homeotic protein or nucleic acid sequence will possess a relatively high degree of sequence identity when aligned using standard methods.

[0055] Methods of alignment of sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith and Waterman (1981); Needleman and Wunsch (1970); Pearson and Lipman (1988); Higgins and Sharp (1988); Higgins and Sharp (1989); Corpet et al. (1988); Huang et al. (1992); and Pearson et al. (1994). Altschul et al. (1994) presents a detailed consideration of sequence alignment methods and homology calculations.

[0056] The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, Md.) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. It can be accessed at http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to determine sequence identity using this program is available at http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html.

[0057] Homologs of the disclosed floral homeotic proteins are typically characterized by possession of at least 50% sequence identity counted over the full length alignment with the amino acid sequence of a selected floral homeotic protein using the NCBI Blast 2.0, gapped blastp set to default parameters. Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity. When less than the entire sequence is being compared for sequence identity, homologs will typically possess at least 75% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least 90% or 95% depending on their similarity to the reference sequence. Methods for determining sequence identity over such short windows are described at http://www.ncbi.nlm.nih.gov/BLAST/blast_FAQs.html. One of skill in the art will appreciate that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall outside of the ranges provided. The present invention provides not only the peptide homologs as described above, but also nucleic acid molecules that encode such homologs.

[0058] Homologs of the disclosed floral homeotic nucleic acids are typically characterized by possession of at least 50% sequence identity counted over the full length alignment with the nucleic acid sequence of a selected floral homeotic gene using the NCBI Blast 2.0, blastn set to default parameters. Homologs with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity.
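By way of illustration only, the percentage identity thresholds discussed above can be sketched as a simple calculation over a pre-computed pairwise alignment. The sketch below is not the BLAST implementation and will not necessarily reproduce BLAST's reported figures; the function names and the two aligned fragments are hypothetical and were invented for this example. It assumes the alignment has already been produced by another tool and that gaps are written as "-".

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over an alignment; gap columns count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def window_identities(aligned_a: str, aligned_b: str, window: int = 20) -> list[float]:
    """Percent identity for every run of `window` consecutive alignment columns."""
    return [
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    ]


# Hypothetical aligned fragments, invented for illustration only.
ref = "MGRGKIEIKRIENKTNRQVT"
hit = "MGRGKIEIKRIENSSNRQVT"

full_length = percent_identity(ref, hit)
best_window = max(window_identities(ref, hit, window=10))
print(f"full length: {full_length:.1f}%  best 10-residue window: {best_window:.1f}%")
```

A candidate would then be compared against whichever threshold is of interest (for example, at least 50% over the full length, or at least 75% over a 10-20 residue window).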

[0059] An alternative indication that two nucleic acid molecules are closely related is that the two molecules hybridize to each other under stringent conditions. Stringent conditions are sequence dependent and are different under different environmental parameters. Generally, stringent conditions are selected to be about 5° C. to 20° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Conditions for nucleic acid hybridization and calculation of stringencies can be found in Sambrook et al. (1989) and Tijssen (1993). Nucleic acid molecules that hybridize under stringent conditions to a disclosed nucleic acid sequence will typically hybridize to a probe corresponding to either the entire cDNA or selected portions of the cDNA under wash conditions of 0.2×SSC, 0.1% SDS at 65° C.
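The arithmetic behind these conditions can be sketched as follows. The sketch assumes that the melting temperature has already been estimated by one of the methods in the cited references (no melting-temperature formula is implemented here), and it uses the standard SSC recipe in which 1×SSC is 0.15 M NaCl and 0.015 M sodium citrate. The function names and the example melting temperature of 80° C. are hypothetical.

```python
def stringent_window(tm_celsius: float) -> tuple[float, float]:
    """Return the (lowest, highest) temperature lying 20 to 5 deg C below Tm."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


def ssc_molarity(fold: float) -> dict[str, float]:
    """Molar NaCl and sodium citrate in an n-fold SSC solution.

    Assumes the standard recipe: 1x SSC = 0.15 M NaCl, 0.015 M sodium citrate.
    """
    return {"NaCl_M": 0.15 * fold, "sodium_citrate_M": 0.015 * fold}


low, high = stringent_window(80.0)  # 80.0 is a made-up Tm for illustration
wash = ssc_molarity(0.2)            # the 0.2x SSC wash named in the text
print(f"stringent range: {low:.0f} to {high:.0f} deg C")
print(f"0.2x SSC: {wash['NaCl_M'] * 1000:.0f} mM NaCl, "
      f"{wash['sodium_citrate_M'] * 1000:.0f} mM sodium citrate")
```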

[0060] Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
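The effect of degeneracy can be illustrated with a short sketch using the standard genetic code. The first sequence below is the 27-nucleotide primer given elsewhere in this document as Seq. I.D. No. 17, which begins with ATG; the second is a hypothetical variant constructed for this example by substituting synonymous codons. The helper names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


original = "ATGGGTCGTGGAAAGATTGAAATCAAG"  # Seq. I.D. No. 17
variant = "ATGGGCAGAGGGAAAATCGAGATAAAA"   # hypothetical synonymous variant

differing = mismatches(original, variant)
identity = 100.0 * (len(original) - differing) / len(original)
print(translate(original), translate(variant))
print(f"{differing} nucleotide differences, {identity:.1f}% nucleotide identity")
```

In this example the two sequences differ at one third of their nucleotide positions yet encode the same nine residues.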

[0061] Floral Specific Promoter: As used herein, the term “floral specific promoter” refers to a regulatory sequence which confers gene expression only in, or predominantly in, floral tissues. The complete sequences of four floral specific promoters are disclosed herein: the promoter of PTD, located within the 5′ regulatory region comprising nucleotides 1-1872 of Seq. I.D. No. 1; the promoter of PTLF, located within the 5′ regulatory region comprising nucleotides 1-2638 of Seq. I.D. No. 5; the promoter of PTAG-1, located within the 5′ regulatory region comprising nucleotides 1-2410 of Seq. I.D. No. 9; and the promoter of PTAG-2, located within the 5′ regulatory region comprising nucleotides 1-2336 of Seq. I.D. No. 13. Accordingly, these promoter sequences may be used to produce transgene constructs that are specifically or predominantly expressed in floral tissues. One of skill in the art will recognize that effective floral-specific expression may be achieved with less than the entire promoter sequences noted above. Thus, by way of example, floral-specific expression may be obtained by employing sequences comprising 500 nucleotides or fewer (e.g., 250, 200, 150, or 100 nucleotides) upstream of the start codon, AUG, of the disclosed gene sequences.

[0062] The determination of whether a particular sub-region of the disclosed sequences operates to confer floral specific expression in a particular system (taking into account the plant species into which the construct is being introduced, the level of expression required, etc.) is performed using known methods, such as operably linking the promoter sub-region to a marker gene (e.g., GUS), introducing such constructs into plants and then determining the level of expression of the marker gene in floral and other plant tissues. Sub-regions which confer only or predominantly floral expression are considered to contain the necessary elements to confer floral specific expression.
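Selecting candidate promoter sub-regions of the lengths mentioned above amounts to slicing the nucleotides immediately 5′ of the start codon. A minimal sketch follows; the function name and the short placeholder locus are hypothetical, and the position of the start codon in any real sequence is an input that must be taken from the sequence listing rather than from this example.

```python
def upstream_region(locus: str, start_codon_pos: int, length: int) -> str:
    """Return up to `length` nucleotides immediately 5' of the start codon.

    `start_codon_pos` is the 1-based position of the first nucleotide of the
    start codon within `locus`. If fewer than `length` nucleotides are
    available upstream, all of them are returned.
    """
    if not 1 <= start_codon_pos <= len(locus):
        raise ValueError("start codon position lies outside the locus")
    if length < 0:
        raise ValueError("length must be non-negative")
    upstream_end = start_codon_pos - 1  # count of nucleotides before the codon
    begin = max(0, upstream_end - length)
    return locus[begin:upstream_end]


# Placeholder locus invented for illustration: 16 upstream nt, then ATG GCT.
locus = "GGGGCCCCTTTTAAAA" + "ATGGCT"
start = 17

# Candidate sub-region lengths named in the text.
for n in (500, 250, 200, 150, 100):
    fragment = upstream_region(locus, start, n)
    print(n, len(fragment))
```

Each fragment obtained in this way would then be fused to a marker gene and tested as described above; the slicing itself says nothing about whether a given fragment is functional.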

[0063] II. Methods

[0064] The four floral homeotic genes were obtained, and the present invention can be practiced, using standard molecular biology and plant transformation procedures, unless otherwise noted. Standard molecular biology procedures are described in Sambrook et al. (1989), Ausubel et al. (1987) and Innis et al. (1990).

[0065] III. Isolation and Characterization of PTLF

[0066] Genomic DNA was purified from dormant vegetative buds of a single Populus trichocarpa tree using a modified CTAB extraction technique (Wagner et al., 1987). After centrifugation to pellet nuclei, a large gummy pellet of resin was evident. This was left intact during the resuspension of nuclei, and then discarded. Normal yield of DNA was approximately 1 mg per 40 g of tissue. A genomic library was constructed from DNA partially digested with Sau3A, filled in with DNA Pol I and dATP and dGTP, and ligated into LambdaGem-12 vector (Stratagene) having partially filled-in Xho I sites. Packaging of the DNA into phage particles was performed with GigaPack Gold II (Stratagene).

[0067] RNA was extracted using the lithium dodecyl sulfate method of Baker et al. (1990), and purified by centrifugation through a 5.7 M CsCl pad. After redissolving the RNA pellet in TE, pH 8.0, NaCl was added to 400 mM and the RNA was precipitated with EtOH to remove excess CsCl. PolyA⁺ RNA was selected using oligo dT-cellulose columns (mRNA Separation Kit, Clontech). RNA was stored at −80° C. until use. Ten-microgram samples of total RNA were used as templates for single-stranded cDNA synthesis. Reactions included 50 mM TrisHCl (pH 8.3), 75 mM KCl, 10 mM dithiothreitol, 3 mM MgCl₂, 100 μM each dNTP, 4 μg primer XT, 10 μCi [α³²P]-dCTP, and 200 U M-MLV reverse transcriptase (Gibco BRL) in 50 μL. Incubations were performed at 37° C. for 1 hr, then the cDNA was purified with GeneClean (BIO101) silica matrix. Typical yields were 10-40 ng of cDNA, as determined by ³²P incorporation. The size ranges of the cDNA samples were characterized by alkaline gel electrophoresis. cDNA products were between 500 and 4000 bases in length, with an average size of 1000 bases. The DNA was diluted to 0.25 ng/μL in 10 mM TrisHCl, 1 mM EDTA (pH 8.0) and stored at −20° C.

[0068] cDNA libraries were prepared using the Lambda-ZAP cDNA cloning kit (Stratagene). From 5 μg of polyA⁺ RNA, approximately 10⁶ clones were recovered per preparation, with an average size of 1 kb and a size range of 500 bp to 3 kb. A hybridization probe for the Populus FLO/LFY homolog was obtained by touchdown PCR (Don et al., 1991) of the cDNA library with a degenerate primer specific to a highly conserved region of the FLO and LFY genes and a primer specific for the vector plus 3′-end of polyadenylated cDNAs. The PCR protocol was as follows: (94° C., 30 sec; 60° C., 30 sec; 71° C., 1 min)×2, (94° C., 30 sec; 58° C., 30 sec; 71° C., 1 min)×2, (94° C., 30 sec; 56° C., 30 sec; 71° C., 1 min)×2, (94° C., 30 sec; 54° C., 30 sec; 71° C., 1 min)×2, (94° C., 30 sec; 52° C., 30 sec; 71° C., 1 min)×2, (94° C., 30 sec; 50° C., 30 sec; 71° C., 1 min)×8, (94° C., 30 sec; 52° C., 30 sec; 71° C., 1 min)×25. The approximately 480 bp fragment obtained was gel-purified and subcloned into pBluescript SK(−) for further characterization.
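For convenience, the cycling protocol recited above can be written out as a small data structure and expanded cycle by cycle. The sketch below only restates the temperatures, hold times and cycle counts given in the text; the variable and function names are illustrative, and the computed time covers programmed holds only, not ramping between temperatures.

```python
# Each step is (temperature in deg C, hold time in seconds).
DENATURE = (94, 30)
EXTEND = (71, 60)
ANNEAL_SECONDS = 30

# (annealing temperature in deg C, number of cycles), in the order listed.
ANNEALING_STAGES = [(60, 2), (58, 2), (56, 2), (54, 2), (52, 2), (50, 8), (52, 25)]


def expand_program(stages):
    """Expand the staged protocol into a flat list of three-step cycles."""
    program = []
    for anneal_temp, cycles in stages:
        for _ in range(cycles):
            program.append([DENATURE, (anneal_temp, ANNEAL_SECONDS), EXTEND])
    return program


program = expand_program(ANNEALING_STAGES)
total_cycles = len(program)
hold_seconds = sum(seconds for cycle in program for _, seconds in cycle)
print(f"{total_cycles} cycles, {hold_seconds / 60:.0f} min of programmed hold time")
```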

[0069] The PTLF genomic clone was isolated by screening the genomic library using probes derived from the PTLF cDNA sequence. Sequencing of the cDNA was performed using the dideoxy-terminator-based Sequenase 2.0 kit (United States Biochemical Corp.), according to the methods described by the manufacturer. Most sequencing of the cDNA and subclones of the gene was done using universal primers on nested deletions created with ExoIII (Henikoff, 1984). Gaps were filled in by sequencing from specific primers synthesized at Oregon State University. Sequence analysis was performed using PCGENE (Intelligenetics).

[0070] A total of 5,656 bp of the PTLF gene locus was sequenced, including 2,638 bp upstream of the initiation codon and 457 bp downstream of the polyA addition site. This sequence is available on GenBank (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html) under accession number U93196 and is shown in Seq. I.D. No. 5. The positions of the two introns found in both FLO and LFY are conserved in PTLF. The longest cDNA obtained (Seq. I.D. No. 6) includes an open reading frame (Seq. I.D. No. 7) that encodes a predicted polypeptide of 377 amino acid residues (Seq. I.D. No. 8). Comparison of the deduced PTLF amino acid sequence with several FLO/LFY homologs revealed conserved amino- and carboxyl-terminal domains (133 and 175 residues, respectively, in PTLF) linked by a poorly conserved, highly charged domain (69 residues). The overall sequence identity between PTLF and FLO (Coen et al., 1990) is 79%, with 88% amino acid sequence similarity.

[0071] Due to the limited seasonal availability of inflorescence and flower tissue, and the difficulty of obtaining large amounts of developing meristems, the levels of PTLF expression were compared using RT-PCR. PTLF was detected most strongly in developing inflorescences, with no significant differences between samples from male and female trees.

[0072] For in situ hybridization analysis, tissue samples from various sources were fixed, embedded, sectioned, and hybridized as described by Kelly et al. (1995), with the following modifications. Sections were 10 μm in thickness. Probes were generated from a plasmid consisting of the PTLF cDNA inserted between the EcoRI and Kpn I sites of the vector pBluescriptII SK (−), and were not alkaline hydrolyzed. A PTLF antisense probe hybridized strongly to the floral meristems and developing flowers of both male and female plants. PTLF was not detected in the apical inflorescence meristem, but was seen in the flanking nascent floral meristems. Developing flowers showed expression in the immature carpels and anthers. Both male and female flowers exhibited some hybridization on the inner (adaxial) rim of the perianth cup during the middle stages of development. PTLF also showed marked hybridization to bracts. Hybridization was observed with vegetative buds from mature branches. The pattern of hybridization showed that there was RNA in the axils of the newly formed leaves, but not in the center of the vegetative meristem. There was also significant expression in the tips of the leaf primordia, and in some portions of the surrounding developing leaves.

[0073] Overexpression and antisense constructs of PTLF cDNA were produced for analysis in transgenic trees. The insert from the cDNA clone of PTLF was cut out using EcoR I and Kpn I, and the ends were polished with T4 DNA polymerase. The insert was then ligated into the Sma I site of pBI121 (Jefferson et al., 1987). Clones with each orientation were identified by PCR, and the structures of the junction sites near the promoters of both were verified by sequencing of the PCR fragments. Hybrid aspens were used for transformation, in part because of the relative ease of transformation, and in part because of concern that transgenic cottonwoods might interact with native cottonwoods in the vicinity of the experimental site. The P. tremula x alba hybrid aspen female clone 717-1B4 and the P. tremula x tremuloides hybrid aspen male clone 353-38 were transformed with pDW151 (Weigel and Nilsson, 1995) and the above binary vectors using Agrobacterium tumefaciens strain C58 (Leple et al., 1992) with modifications as described by Han et al. (1996).

[0074] Although overexpression of LFY in aspens was reported to result in short, bushy plants that flower within a year (Weigel and Nilsson, 1995), no such obvious phenotypes were seen with PTLF. During more than one year of growth in soil in a greenhouse, and an additional year at a field site in Corvallis, OR, few differences were noted for any of the transgenics relative to control plants.

[0075] IV. Isolation and Characterization of PTD

[0076] The PTD cDNA and gene were isolated by probing the Populus cDNA library described above at low stringency using an Eco RI fragment of pCIT2241 (Ma et al., 1991) which contains the MADS box region of AGL1. The PTD cDNA (Seq. I.D. No. 2) comprises an open reading frame (Seq. I.D. No. 3) encoding a 227 amino acid polypeptide (Seq. I.D. No. 4). The PTD gene (Seq. I.D. No. 1) consists of seven exons.

[0077] The PTD polypeptide is 81% conserved overall with respect to DEF. PTD has MADS and K domains. The MADS domain extends over amino acids 1-57, while the K-domain extends over amino acids 87-154. The MADS domain is 93% conserved with respect to DEF, whereas the K domain is 85% conserved at the amino acid level.

[0078] To determine whether the promoter of PTD would confer floral-specific expression, 1.9 kb of its promoter and 5′ untranslated region were fused to a GUS-intron reporter gene, and introduced into Arabidopsis, tobacco and poplar. GUS expression was observed in floral tissues including petals and stamens. This expression pattern is characteristic of a “B function” gene like APETALA3, suggesting that PTD has retained the regulatory motifs (i.e., sequence patterns) that direct it to stamens and petals (though poplar has no true petals). No vegetative GUS expression was observed, except in poplar, where vegetative expression was confined to leaf-like structures subtending induced floral structures.

[0079] V. Isolation and Characterization of PTAG-1 and PTAG-2

[0080] Two cDNAs and their corresponding genes were isolated from Populus using the methodologies described above and a probe derived from the 3′ region of the AG cDNA. Denoted PTAG-1 and PTAG-2, these two sequences are the orthologs of AG.

[0081] The genomic, cDNA and open reading frame sequences of PTAG-1 are shown in Seq. I.D. Nos. 9, 10 and 11, respectively. The open reading frame encodes a polypeptide of 241 amino acids in length (Seq. I.D. No. 12). The PTAG-1 polypeptide contains both a MADS domain and a K-domain. The MADS domain extends from amino acids 17-72 and the K-domain from amino acids 106-172. The PTAG-1 nucleotide and amino acid sequences are available on GenBank under accession number AF052570.

[0082] The genomic, cDNA and open reading frame sequences of PTAG-2 are shown in Seq. I.D. Nos. 13, 14 and 15, respectively. The open reading frame encodes a polypeptide of 238 amino acids in length (Seq. I.D. No. 16). The PTAG-2 polypeptide contains both a MADS domain and a K-domain. The MADS domain extends from amino acids 16-72 and the K-domain from amino acids 106-172. The PTAG-2 nucleotide and amino acid sequences are available on GenBank under accession number AF052571.

[0083] Like AG (Yanofsky et al., 1990), both PTAG-1 and PTAG-2 contain 8 introns at conserved positions. All introns have canonical donor (GT) and acceptor (AG) sites.

[0084] At the amino acid level, PTAG-1 and PTAG-2 are 89% identical, and show 72-75% sequence similarity with AG.

[0085] Because AG is only expressed in floral tissues and is essential for the development of both male and female reproductive organs, it is ideally suited for use in modifying fertility through genetic engineering approaches. In situ hybridization studies show that the PTAG genes in Populus are expressed in the central zone of both male and female floral meristems, and, as with AG, expression begins before reproductive primordia emerge and continues in developing stamens and carpels. Northern analysis of PTAG gene expression in Populus revealed that transcripts are present in immature and mature flowers from both male and female trees. In addition, low levels of PTAG gene expression are present in all vegetative tissues tested. Interestingly, the transcripts from the vegetative tissues are shorter (~150-200 bp) than the floral transcripts. This size difference is not due to alternate intron/exon splicing.

EXAMPLES

[0086] The following examples are provided to illustrate the scope of the invention.

Example 1 Preferred Method of Making the Populus Genes and cDNAs

[0087] With the provision of the four Populus floral homeotic nucleic acid sequences PTD, PTLF, PTAG-1 and PTAG-2, the polymerase chain reaction (PCR) may now be utilized in a preferred method for producing the cDNAs and genes, as well as derivatives of these sequences. PCR amplification of the sequence may be accomplished by direct PCR from an appropriate cDNA or genomic library. Alternatively, the cDNAs may be amplified by Reverse-Transcription PCR (RT-PCR) using RNA extracted from Populus cells as a template. Similarly, the gene sequences may be directly amplified using Populus genomic DNA as a template. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al. (1990). Suitable plant cDNA and genomic libraries for direct PCR include Populus libraries made by methods described above. Other tree cDNA and genomic libraries may be used in order to amplify orthologous cDNAs of tree species, such as Pinus and Eucalyptus.

[0088] The selection of PCR primers will be made according to the portions of the cDNA or gene that are to be amplified. Primers may be chosen to amplify small segments of the cDNA or gene, or the entire cDNA or genes. Variations in amplification conditions may be required to accommodate primers of differing lengths; such considerations are well known in the art and are discussed in Innis et al. (1990), Sambrook et al. (1989), and Ausubel et al. (1987). By way of example only, the PTD cDNA molecule as shown in Seq. I.D. No. 2 (with the exception of the 3′ poly-A tail) may be amplified using the following combination of primers:
5′ ATGGGTCGTGGAAAGATTGAAATCAAG 3′ (Seq. I.D. No. 17)
5′ ATTTGTGAAAAAGAGCTTTTATATTTA 3′ (Seq. I.D. No. 18)

[0089] The open reading frame portion of the PTD cDNA may be amplified using the following primer pair:
5′ ATGGGTCGTGGAAAGATTGAAATCAAG 3′ (Seq. I.D. No. 17)
5′ AGGAAGGCGAAGTTCATGGGATCCAAA 3′ (Seq. I.D. No. 19)

[0090] A derivative version of the PTD ORF that lacks the MADS box domain may be amplified using the following primers:
5′ TCCACATCGACAAAGAAGATCTACGAT 3′ (Seq. I.D. No. 20)
5′ AGGAAGGCGAAGTTCATGGGATCCAAA 3′ (Seq. I.D. No. 19)
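As a simple aid when varying amplification conditions, basic properties of the four primers listed above (length and G+C content) can be tabulated. The sketch below uses only the primer sequences as given; the helper name is illustrative, and no melting temperature is computed because the text does not specify a formula.

```python
PRIMERS = {
    "Seq. I.D. No. 17": "ATGGGTCGTGGAAAGATTGAAATCAAG",
    "Seq. I.D. No. 18": "ATTTGTGAAAAAGAGCTTTTATATTTA",
    "Seq. I.D. No. 19": "AGGAAGGCGAAGTTCATGGGATCCAAA",
    "Seq. I.D. No. 20": "TCCACATCGACAAAGAAGATCTACGAT",
}


def gc_count(seq: str) -> int:
    """Number of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


for name, seq in PRIMERS.items():
    gc = gc_count(seq)
    print(f"{name}: {len(seq)} nt, G+C = {gc} ({100.0 * gc / len(seq):.0f}%)")
```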

[0091] These primers are illustrative only; it will be appreciated by one skilled in the art that many different primers may be derived from the provided cDNA and gene sequences in order to amplify particular regions of the provided nucleic acid molecules. Suitable amplification conditions include those described above for the original isolation of the PTLF cDNA. As is well known in the art, amplification conditions may need to be varied in order to amplify orthologous genes where the sequence identity is not 100%; in such cases, the use of nested primers, as described above, may be beneficial. Resequencing of PCR products obtained by these amplification procedures is recommended; this will facilitate confirmation of the amplified cDNA sequence and will also provide information on natural variation in this sequence in different ecotypes, cultivars and plant populations.

[0092] Oligonucleotides that are derived from the PTD, PTLF, PTAG-1 and PTAG-2 cDNA and gene sequences and which are suitable for use as PCR primers to amplify corresponding nucleic acid sequences are encompassed within the scope of the present invention. Preferably, such oligonucleotide primers will comprise a sequence of 15-20 consecutive nucleotides of the selected cDNA or gene sequence. To enhance amplification specificity, primers comprising at least 25, 30, 35, 50 or 100 consecutive nucleotides of the PTD, PTLF, PTAG-1 or PTAG-2 gene or cDNA sequences may be used.
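Whether a given oligonucleotide comprises a stretch of consecutive nucleotides from a selected sequence can be checked mechanically. The following sketch reports the longest run of a primer that occurs in a template on either strand; the function names, the 15-nucleotide default threshold (taken from the lower end of the range above) and the short template are assumptions made for this example, and the template is not one of the disclosed sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_shared_run(primer: str, template: str) -> int:
    """Length of the longest stretch of `primer` found in either template strand."""
    strands = (template.upper(), reverse_complement(template))
    primer = primer.upper()
    best = 0
    for start in range(len(primer)):
        # Only look for runs longer than the best one found so far.
        end = start + best + 1
        while end <= len(primer) and any(primer[start:end] in s for s in strands):
            best = end - start
            end += 1
    return best


def is_derived(primer: str, template: str, min_run: int = 15) -> bool:
    """True if the primer shares at least `min_run` consecutive nt with the template."""
    return longest_shared_run(primer, template) >= min_run


# Hypothetical template, invented for illustration only.
template = "GGATCC" + "ATGGGTCGTGGAAAGATTGAAATCAAG" + "TTTT"
forward = template[6:26]                      # 20 nt from the top strand
reverse = reverse_complement(template)[:18]   # 18 nt from the bottom strand
print(is_derived(forward, template), is_derived(reverse, template))
```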

Example 2 Use of the Populus Genes and cDNAs to Modify Fertility Characteristics

[0093] Once a nucleic acid encoding a protein involved in the determination of a particular plant characteristic, such as flowering, has been isolated, standard techniques may be used to express the nucleic acid in transgenic plants in order to modify that particular plant characteristic. One approach is to clone the nucleic acid into a vector, such that it is operably linked to control sequences (e.g., a promoter) which direct expression of the nucleic acid in plant cells. The transformation vector is then introduced into plant cells by one of a number of techniques (e.g., electroporation and Agrobacterium-mediated transformation) and progeny plants containing the introduced nucleic acid are selected. Preferably all or part of the transformation vector will stably integrate into the genome of the plant cell. That part of the vector which integrates into the plant cell and which contains the introduced nucleic acid and associated sequences for controlling expression (the introduced “transgene”) may be referred to as the recombinant expression cassette.

[0094] Selection of progeny plants containing the introduced transgene may be made based upon the detection of an altered phenotype. Such a phenotype may result directly from the nucleic acid cloned into the transformation vector or may be manifested as enhanced resistance to a chemical agent (such as an antibiotic) as a result of the inclusion of a dominant selectable marker gene incorporated into the transformation vector.

[0095] The choice of (a) control sequences and (b) how the nucleic acid (or selected portions of the nucleic acid) is arranged in the transformation vector relative to the control sequences determine, in part, how the plant characteristic affected by the introduced nucleic acid is modified. For example, the control sequences may be tissue specific, such that the nucleic acid is only expressed in particular tissues of the plant (e.g., reproductive tissues) and so the affected characteristic will be modified only in those tissues. The nucleic acid sequence may be arranged relative to the control sequence such that the nucleic acid transcript is expressed normally, or in an antisense orientation. Expression of an antisense RNA that is the reverse complement of the cloned nucleic acid will result in a reduction of the targeted gene product (the targeted gene product being the protein encoded by the plant gene from which the introduced nucleic acid was derived). Over-expression of the introduced nucleic acid, resulting from a plus-sense orientation of the nucleic acid relative to the control sequences in the vector, may lead to an increase in the level of the gene product, or may result in a reduction in the level of the gene product due to co-suppression (also termed “sense suppression”) of that gene product. In another approach, the nucleic acid sequence may be modified such that certain domains of the encoded peptide are deleted. Depending on the domain deleted, such modified nucleic acids may act as dominant negative mutations, suppressing the phenotypic effects of the corresponding endogenous gene.

[0096] Successful examples of the modification of plant characteristics by transformation with cloned nucleic acid sequences are replete in the technical and scientific literature. Selected examples, which serve to illustrate the level of knowledge in this field of technology include:

[0097] U.S. Pat. No. 5,432,068 to Albertson (control of male fertility using externally inducible promoter sequences);

[0098] U.S. Pat. No. 5,686,649 to Chua (suppression of plant gene expression using processing-defective RNA constructs);

[0099] U.S. Pat. No. 5,659,124 to Crossland (transgenic male sterile plants);

[0100] U.S. Pat. No. 5,451,514 to Boudet (modification of lignin synthesis using antisense RNA and co-suppression);

[0101] U.S. Pat. No. 5,443,974 to Hitz (modification of saturated and unsaturated fatty acid levels using antisense RNA and co-suppression);

[0102] U.S. Pat. No. 5,530,192 to Murase (modification of amino acid and fatty acid composition using antisense RNA);

[0103] U.S. Pat. No. 5,455,167 to Voelker (modification of medium chain fatty acids); U.S. Pat. No. 5,231,020 to Jorgensen (modification of flavonoids using co-suppression);

[0104] U.S. Pat. No. 5,583,021 to Dougherty (modification of virus resistance by expression of plus-sense RNA); and

[0105] Mizukami et al. (1996) (dominant negative mutations in floral development using partial deletions of AG).

[0106] These examples include descriptions of transformation vector selection, transformation techniques and the production of constructs designed to over-express an introduced nucleic acid, dominant negative mutant forms, untranslatable RNA forms or antisense RNA. In light of the foregoing and the provision herein of the PTD, PTLF, PTAG-1 and PTAG-2 cDNA and gene sequences, it is apparent that one of skill in the art will be able to introduce these cDNAs or genes, or derivative forms of these sequences (e.g., antisense forms), into plants in order to produce plants having modified fertility characteristics, particularly sterility. This Example provides a description of the approaches that may be used to achieve this goal. For convenience the PTD, PTLF, PTAG-1 and PTAG-2 cDNAs and genes disclosed herein will be generically referred to as the “floral homeotic nucleic acids,” and the encoded polypeptides as the “floral homeotic polypeptides”. Example 3 provides an exemplary illustration of how an antisense form of one of these floral homeotic nucleic acids, specifically the PTD cDNA, may be introduced into poplar species using Agrobacterium transformation, in order to produce genetically engineered sterile poplars. Example 4 provides an exemplary illustration of how mutant forms of PTAG-1 may be produced and introduced into poplar species to produce modified fertility characteristics.

[0107] a. Plant Types

[0108] The floral homeotic nucleic acids disclosed herein may be used to produce transgenic plants having modified fertility characteristics. In particular, the amenable plant species include, but are not limited to, members of the genus Populus, including Populus trichocarpa (commonly known as black cottonwood, California poplar and western balsam poplar) and poplar hybrid species. Other woody species that are amenable to fertility modification by the methods disclosed herein include members of the genera Picea, Pinus, Pseudotsuga, Tsuga, Sequoia, Abies, Thuja, Libocedrus, Chamaecyparis and Larix. In particular, members of the genera Eucalyptus, Acacia and Gmelina, which are becoming increasingly important for pulp production, may be engineered for sterility using the nucleic acid sequences and methods disclosed here.

[0109] b. Vector Construction, Choice of Promoters

[0110] A number of recombinant vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants have been described including those described in Pouwels et al., (1987), Weissbach and Weissbach, (1989), and Gelvin et al., (1990). Typically, plant transformation vectors include one or more cloned plant genes (or cDNAs) under the transcriptional control of 5′ and 3′ regulatory sequences and a dominant selectable marker. Such plant transformation vectors typically also contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally or developmentally regulated, or cell- or tissue-specific expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.

[0111] Examples of constitutive plant promoters which may be useful for expressing the floral homeotic nucleic acids include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al., 1994, Dekeyser et al., 1990, Terada and Shimamoto, 1990; Benfey and Chua, 1990); the nopaline synthase promoter (An et al., 1988); and the octopine synthase promoter (Fromm et al., 1989).

[0112] A variety of plant gene promoters that are regulated in response to environmental, hormonal, chemical, and/or developmental signals also can be used for expression of the floral homeotic nucleic acids in plant cells, including promoters regulated by: (a) heat (Callis et al., 1988; Ainley et al., 1993; Gilmartin et al., 1992); (b) light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al., 1989, and the maize rbcS promoter, Schaffner and Sheen, 1991); (c) hormones, such as abscisic acid (Marcotte et al., 1989); (d) wounding (e.g., wunI, Siebertz et al., 1989); and (e) chemicals such as methyl jasmonate or salicylic acid (see also Gatz, 1997).

[0113] Alternatively, tissue specific (root, leaf, flower, and seed, for example) promoters (Carpenter et al., 1992; Denis et al., 1993; Opperman et al., 1993; Stockhause et al., 1997; Roshal et al., 1987; Schernthaner et al., 1988; and Bustos et al., 1989) can be fused to the coding sequence to obtain expression in the respective organs. In addition, the timing of the expression can be controlled by using promoters such as those acting at senescence (Gan and Amasino, 1995) or late seed development (Odell et al., 1994).

[0114] The promoter regions of the PTD, PTLF, PTAG-1 or PTAG-2 gene sequences confer floral-specific (or floral-enriched) expression in Populus. Accordingly, these native promoters may be used to obtain floral-specific (or floral-enriched) expression of the introduced transgene.

[0115] Plant transformation vectors may also include RNA processing signals, for example, introns, which may be positioned upstream or downstream of the ORF sequence in the transgene. In addition, the expression vectors may also include additional regulatory sequences from the 3′-untranslated region of plant genes, e.g., a 3′ terminator region to increase stability of the mRNA, such as the PI-II terminator region of potato or the octopine or nopaline synthase 3′ terminator regions.

[0116] Finally, as noted above, plant transformation vectors may also include dominant selectable marker genes to allow for the ready selection of transformants. Such genes include antibiotic resistance genes (e.g., resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin) and herbicide resistance genes (e.g., phosphinothricin acetyltransferase).

[0117] c. Arrangement of Floral Homeotic Nucleic Acid Sequence in Vector

[0118] Modified fertility characteristics in plants may be obtained using the floral homeotic nucleic acid sequences disclosed herein in a variety of forms. Over-expression, sense-suppression, antisense RNA and dominant negative mutant forms of the disclosed floral homeotic nucleic acid sequences may be constructed in order to modulate or supplement the expression of the corresponding endogenous floral homeotic genes, and thereby to produce plants having modified fertility characteristics. Alternatively, the floral-specific (or floral-enriched) expression conferred by the promoters of the disclosed floral homeotic genes may be employed to obtain corresponding expression of cytotoxic products. Such constructs will comprise the appropriate floral homeotic promoter sequence operably linked to a suitable open reading frame (discussed further below) and will be useful in genetic ablation approaches to engineering sterility in plants.

[0119] i. Modulation/Supplementation of Floral Homeotic Nucleic Acid Expression

[0120] The particular arrangement of the floral homeotic nucleic acid sequence in the transformation vector will be selected according to the type of expression of the sequence that is desired.

[0121] Enhanced expression of a floral homeotic nucleic acid may be achieved by operably linking the floral homeotic nucleic acid to a constitutive high-level promoter such as the CaMV 35S promoter. As noted below, modified activity of a floral homeotic polypeptide in planta may also be achieved by introducing into a plant a transformation vector containing a variant form of a floral homeotic nucleic acid, for example a form which varies from the exact nucleotide sequence of the disclosed floral homeotic nucleic acid.

[0122] A reduction in the activity of a floral homeotic polypeptide in the transgenic plant may be obtained by introducing into plants antisense constructs based on the floral homeotic nucleic acid sequence. For expression of antisense RNA, the floral homeotic nucleic acid is arranged in reverse orientation relative to the promoter sequence in the transformation vector. The introduced sequence need not be the full length floral homeotic nucleic acid, and need not be exactly homologous to the floral homeotic nucleic acid found in the plant type to be transformed. Generally, however, where the introduced sequence is of shorter length, a higher degree of homology to the native floral homeotic nucleic acid sequence will be needed for effective antisense suppression. Preferably, the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. Preferably, the length of the antisense sequence in the vector will be greater than 100 nucleotides. Transcription of an antisense construct as described results in the production of RNA molecules that are the reverse complement of mRNA molecules transcribed from the endogenous floral homeotic gene in the plant cell. Although the exact mechanism by which antisense RNA molecules interfere with gene expression has not been elucidated, it is believed that antisense RNA molecules bind to the endogenous mRNA molecules and thereby inhibit translation of the endogenous mRNA.
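The relationship between a sense fragment and the RNA transcribed from it when it is cloned in reverse orientation can be sketched as follows. The function names are illustrative; the 30-nucleotide default minimum reflects the preference stated above and is a parameter of the sketch, not a property of any particular vector.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_transcript(sense_fragment: str, min_length: int = 30) -> str:
    """RNA transcribed when a sense-strand fragment is placed in reverse
    orientation relative to the promoter, written 5' to 3'."""
    if len(sense_fragment) < min_length:
        raise ValueError(
            f"fragment is {len(sense_fragment)} nt; at least {min_length} nt expected"
        )
    return reverse_complement(sense_fragment).replace("T", "U")


# Short made-up fragment; the length check is relaxed for the demonstration.
print(antisense_transcript("ATGGGT", min_length=1))
```

The resulting RNA is the reverse complement of the mRNA segment corresponding to the fragment, which is the property relied upon for antisense suppression as described above.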

[0123] Suppression of endogenous floral homeotic polypeptide activity can also be achieved using ribozymes. Ribozymes are synthetic RNA molecules that possess highly specific endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Pat. No. 4,987,071 to Cech and U.S. Pat. No. 5,543,508 to Haseloff. The inclusion of ribozyme sequences within antisense RNAs may be used to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules that bind to the antisense RNA are cleaved, which in turn leads to an enhanced antisense inhibition of endogenous gene expression.

[0124] Constructs in which the floral homeotic nucleic acid (or a variant thereof) is over-expressed may also be used to obtain co-suppression of the endogenous floral homeotic gene in the manner described in U.S. Pat. No. 5,231,021 to Jorgensen. Such co-suppression (also termed sense suppression) does not require that the entire floral homeotic cDNA or gene be introduced into the plant cells, nor does it require that the introduced sequence be exactly identical to the endogenous floral homeotic gene. However, as with antisense suppression, the suppressive efficiency will be enhanced as (1) the introduced sequence is lengthened and (2) the sequence similarity between the introduced sequence and the endogenous floral homeotic gene is increased.

[0125] Constructs expressing an untranslatable form of the floral homeotic nucleic acid mRNA may also be used to suppress the expression of endogenous floral homeotic genes. Methods for producing such constructs are described in U.S. Pat. No. 5,583,021 to Dougherty et al. Preferably, such constructs are made by introducing a premature stop codon into the floral homeotic nucleic acid ORF.
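By way of illustration only, the introduction of a premature stop codon into an open reading frame may be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the method of Dougherty et al.: the function names are hypothetical, the ORF is assumed to be an in-frame A/C/G/T string beginning at the start codon, and the toy sequence in the usage note is not a disclosed sequence.

```python
# Illustrative sketch only; function names and the toy ORF are hypothetical.
from typing import List, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(orf: str) -> List[str]:
    """Split an in-frame ORF into its codons."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def introduce_premature_stop(orf: str, codon_index: int, stop: str = "TAA") -> str:
    """Replace the codon at the 0-based codon_index with a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG or TGA")
    codons = split_codons(orf)
    # Keep the start codon and the natural last codon untouched.
    if not 0 < codon_index < len(codons) - 1:
        raise ValueError("choose a codon between the start and last codons")
    codons[codon_index] = stop
    return "".join(codons)


def first_stop_index(orf: str) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, if any."""
    for i, codon in enumerate(split_codons(orf)):
        if codon in STOP_CODONS:
            return i
    return None
```

For example, applying `introduce_premature_stop` to the toy ORF `ATGGCCAAATAA` at codon index 1 replaces the GCC codon with TAA, so that translation would terminate immediately after the start codon.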

[0126] Finally, dominant negative mutant forms of the disclosed sequences may be used to block endogenous floral homeotic polypeptide activity using approaches similar to that described by Mizukami et al. (1996). Such mutants require the production of mutated forms of the floral homeotic polypeptide that bind either to an endogenous binding target (for example, a nucleic acid sequence in the case of floral homeotic polypeptides, such as PTD, that function as transcription factors) or to a second polypeptide sequence (such as transcription co-factors), but do not function normally after such binding (i.e. do not function in the same manner as the non-mutated form of the polypeptide). By way of example, such dominant mutants can be constructed by deleting all or part of the C-terminal domain of a floral homeotic polypeptide, leaving an intact MADS domain. Polypeptides lacking all or part of the C-terminal region may bind to the appropriate DNA target, but are unable to interact with protein co-factors, thereby blocking transcription. Alternatively, dominant negative mutants may be produced by deleting all or part of the MADS domain, or all or part of the K-domain.

[0127] ii. Genetic Ablation

[0128] An alternative approach to modulating floral development is to specifically target a cytotoxic gene product to the floral tissues. This may be achieved by producing transgenic plants that express a cytotoxic gene product under the control of a floral-specific promoter, such as the promoter regions of PTLF, PTD, PTAG-1 and PTAG-2 as disclosed herein. The promoter regions of these gene sequences are generally contained within the first 150 base pairs of sequence upstream of the open reading frame, although floral-specific expression may be conferred by using smaller regions of this sequence. Thus, regions as small as the first 50 base pairs of sequence upstream of the open reading frame may be effective in conferring floral-specific expression. However, longer regions, such as at least 100, 150, 200 or 250 base pairs of the upstream sequences are preferred.
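The selection of upstream fragments of different lengths can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the function names are hypothetical, the genomic sequence is assumed to be available as a string, and `orf_start` is assumed to be the 0-based position of the first base of the start codon. The candidate lengths (50, 100, 150, 200 and 250 base pairs) are those mentioned above.

```python
# Illustrative sketch only; function names are hypothetical.
from typing import Dict

CANDIDATE_LENGTHS = (50, 100, 150, 200, 250)  # lengths mentioned in the text


def upstream_region(genomic: str, orf_start: int, length: int) -> str:
    """Return `length` base pairs immediately upstream of the ORF.

    `orf_start` is the 0-based index of the first base of the start codon.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= orf_start <= len(genomic):
        raise ValueError("orf_start lies outside the genomic sequence")
    if orf_start < length:
        raise ValueError("not enough upstream sequence available")
    return genomic[orf_start - length:orf_start]


def candidate_promoter_fragments(genomic: str, orf_start: int) -> Dict[int, str]:
    """Collect every candidate fragment for which enough sequence exists."""
    return {
        n: upstream_region(genomic, orf_start, n)
        for n in CANDIDATE_LENGTHS
        if n <= orf_start
    }
```

Each fragment returned in this way would then be fused to a cytotoxic coding sequence and tested for floral-specific expression; the sketch does not predict which fragment is sufficient.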

[0129] A number of known cytotoxic gene products may be expressed under the control of the disclosed promoter sequences of the floral homeotic genes. These include: RNases, such as barnase from Bacillus amyloliquefaciens and RNase-T1 from Aspergillus (Mariani et al., 1990; Mariani et al., 1992; Reynaerts et al., 1993); ADP-ribosyl-transferase (Diphtheria toxin A chain) (Pappenheimer, 1977; Thorness et al., 1991; Kandasamy et al., 1993); RolC from Agrobacterium rhizogenes (Schmulling et al., 1993); DTA (diphtheria toxin A) (Pappenheimer, 1977) and glucanase (Worrall et al., 1992).

[0130] d. Transformation and Regeneration Techniques

[0131] Constructs designed as discussed above to modulate or supplement expression of native floral homeotic genes in plants, or to express cytotoxins in a tissue-specific manner, can be introduced into plants by a variety of means. Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells is now routine, and the selection of the most appropriate transformation technique will be determined by the practitioner. The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens (AT) mediated transformation. Typical procedures for transforming and regenerating plants are described in the patent documents listed at the beginning of this section.

[0132] Methods that are particularly suited to the transformation of woody species include (for Picea species) methods described in Ellis et al. (1991, 1993) and (for Populus species) the use of A. tumefaciens (Settler, 1993; Strauss et al., 1995a,b), A. rhizogenes (Han et al., 1996) and biolistics (McCown et al., 1991).

[0133] e. Selection of Transformed Plants

[0134] Following transformation and regeneration of plants with the transformation vector, transformed plants are preferably selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic resistance on the seedlings of transformed plants, and selection of transformants can be accomplished by exposing the seedlings to appropriate concentrations of the antibiotic.

[0135] After transformed plants are selected and grown to maturity, the effect on fertility can be determined by visual inspection of floral morphology, including the determination of the production of pollen or ova. In addition, the effect on the activity of the endogenous floral homeotic gene may be directly determined by nucleic acid analysis (hybridization or PCR methodologies) or immunoassay of the expressed protein. Antisense or sense suppression of the endogenous floral homeotic gene may be detected by analyzing mRNA expression on Northern blots or by reverse transcription polymerase chain reaction (RT-PCR).

[0136] Example 3

Introduction of Antisense PTD cDNA into Hybrid Aspens

[0137] By way of example, the following methodology may be used to produce poplar trees with modified expression of PTD. The PTD cDNA (Seq. I.D. No. 2) is excised from the cloning vector and blunt ended using T4 DNA polymerase. The cDNA is then ligated into the Sma I site of pBI121 (Jefferson et al., 1987), and clones containing the cDNA in reverse orientation with respect to the promoter are identified by sequence analysis.

[0138] Hybrid aspens, such as the P. tremula x alba hybrid aspen and the P. tremula x tremuloides hybrid aspen, are transformed with pDW151 (Weigel and Nilsson, 1995) and the above binary vectors using Agrobacterium tumefaciens strain C58 (Leple et al., 1992) with modifications as described by Han et al. (1996).

[0139] Expression of the antisense transgene is assessed in immature plants by extraction of mRNA and northern blotting using the PTD cDNA as a probe, or by RT-PCR. Levels of PTD protein are analyzed by extraction and concentration of cellular proteins followed by western blotting, or by in situ hybridization.

Example 4

Expression of Mutant PTAG-1 Sequences in Plants

[0140] PTAG-1 mutants are constructed by PCR amplification using standard PCR methodologies as described above and a Populus cDNA library as a template. A mutant form of PTAG-1 in which the MADS box domain is deleted is amplified using the following primer combination:

5′ GTCACTTTCTGCAAAAGGCGCAGTGGT 3′ (Seq. I.D. No. 21)

5′ AACTAACTGAAGGGCCATCTGATCTTG 3′ (Seq. I.D. No. 22)

[0141] A mutant form of PTAG-1 in which a portion of the C-terminal region of the encoded polypeptide is deleted is amplified using the following primer combination:

5′ ATGGAATATCAAAATGAATCCCTTGAG 3′ (Seq. I.D. No. 23)

5′ ATTCATGCTCTGTCGCTTTCTTTCATTCT 3′ (Seq. I.D. No. 24)

[0142] The amplified products are cloned using standard cloning vectors and then ligated into a transformation vector such as pBI121 (Jefferson et al., 1987).

[0143] Hybrid aspens, such as the P. tremula x alba hybrid aspen and the P. tremula x tremuloides hybrid aspen, are transformed with pDW151 (Weigel and Nilsson, 1995) and the pBI121 binary vector containing the mutant PTAG-1 construct using Agrobacterium tumefaciens strain C58 (Leple et al., 1992) with modifications as described by Han et al. (1996).

[0144] Expression of the mutant PTAG-1 transgenes is assessed in immature plants by extraction of mRNA and northern blotting using the PTAG-1 cDNA as a probe or by RT-PCR. Levels of mutant protein are analyzed by extraction and concentration of cellular proteins followed by western blotting, or by in situ hybridization.

Example 5

Production of Sequence Variants

[0145] As noted above, modification of the activity of floral homeotic polypeptides such as PTD, PTLF, PTAG-1 and PTAG-2 in plant cells can be achieved by transforming plants with a selected floral homeotic nucleic acid (cDNA or gene, or parts thereof), antisense constructs based on the disclosed floral homeotic nucleic acid sequences, or other variants of the disclosed sequences. Sequence variants include not only genetically engineered sequence variants, but also naturally occurring variants that arise within Populus populations, including allelic variants and polymorphisms, as well as variants that occur in different genotypes and species of Populus. These naturally occurring variants may be obtained by PCR amplification from genomic or cDNA libraries made from genetic material of Populus species, or by RT-PCR from mRNA from such species, or by other methods known in the art, including using the disclosed nucleic acids as probes to hybridize with genetic libraries. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al. (1990).

[0146] As noted, variant DNA molecules also include those created by DNA genetic engineering techniques, for example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (1989), Ch. 15. By the use of such techniques, variants may be created which differ in minor ways from the floral homeotic cDNA or gene sequences disclosed. DNA molecules and nucleotide sequences which are derived from the floral homeotic nucleic acids disclosed include DNA sequences which hybridize under stringent conditions to the DNA sequences disclosed, or fragments thereof.

[0147] Nucleic acid molecules and proteins that are variants of those disclosed herein may be identified by the degree of sequence identity that they share with a nucleic acid molecule or protein disclosed herein. Typically, such variants share at least 50% sequence identity with a disclosed nucleic acid or protein, as determined by the methods described above for homologs. Alternatively, for nucleic acid molecules, variants may be identified by their ability to hybridize to a disclosed sequence under stringent conditions, as described above.

[0148] The degeneracy of the genetic code further widens the scope of the present invention as it enables major variations in the nucleotide sequence of a DNA molecule while maintaining the amino acid sequence of the encoded protein. For example, the 32nd amino acid residue of the Poplar PTD protein shown in Seq. I.D. No. 4 is alanine. This is encoded in the Poplar PTD open reading frame by the nucleotide codon triplet GCC. Because of the degeneracy of the genetic code, three other nucleotide codon triplets: GCT, GCA and GCG, also code for alanine. Thus, the nucleotide sequence of the Poplar PTD ORF could be changed at this position to any of these three codons without affecting the amino acid composition of the encoded protein or the characteristics of the protein. Based upon the degeneracy of the genetic code, variant DNA molecules may be derived from the cDNA and gene sequences disclosed herein using standard DNA mutagenesis techniques as described above, or by synthesis of DNA sequences. Thus, this invention also encompasses nucleic acid sequences which encode a floral homeotic protein but which vary from the disclosed nucleic acid sequences by virtue of the degeneracy of the genetic code.
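The alanine example above can be reproduced with a short Python sketch built on the standard genetic code. The sketch is illustrative only: the function names and the toy ORF used in the usage note are hypothetical and are not the disclosed PTD sequence.

```python
# Illustrative sketch only; function names are hypothetical.
from itertools import product
from typing import List

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> List[str]:
    """Return the other codons that encode the same amino acid."""
    codon = codon.upper()
    aa = CODON_TABLE[codon]
    return [c for c, a in CODON_TABLE.items() if a == aa and c != codon]


def swap_synonymous(orf: str, residue_number: int, new_codon: str) -> str:
    """Replace the codon for the 1-based residue with a synonymous codon."""
    start = (residue_number - 1) * 3
    old = orf[start:start + 3].upper()
    if len(old) != 3:
        raise ValueError("residue_number lies outside the ORF")
    if CODON_TABLE[new_codon.upper()] != CODON_TABLE[old]:
        raise ValueError("replacement codon would change the amino acid")
    return orf[:start] + new_codon.upper() + orf[start + 3:]
```

Here `synonymous_codons("GCC")` returns GCT, GCA and GCG, the three alternative alanine codons named above, and `swap_synonymous` refuses any replacement that would alter the encoded residue.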

[0149] One skilled in the art will recognize that DNA mutagenesis techniques may be used not only to produce variant DNA molecules, but will also facilitate the production of proteins which differ in certain structural aspects from the Poplar floral homeotic proteins, yet which proteins are clearly derivative of these proteins. Newly derived proteins may also be selected in order to obtain variations on the characteristics of the Poplar floral homeotic proteins. Such derivatives include those with variations in amino acid sequence including minor deletions, additions and substitutions.

[0150] While the site for introducing an amino acid sequence variation is predetermined, the mutation per se need not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random mutagenesis may be conducted at the target codon or region and the expressed protein variants screened for the optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in DNA having a known sequence as described above are well known.

[0151] Amino acid substitutions are typically of single residues; insertions usually will be on the order of about 1 to 10 amino acid residues; and deletions will range from about 1 to 30 residues. Deletions or insertions preferably are made in adjacent pairs, i.e., a deletion of two residues or insertion of two residues. Substitutions, deletions, insertions or any combination thereof may be combined to arrive at a final construct. The mutations that are made in the DNA encoding the protein must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure.
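The reading-frame requirement can be expressed as a simple arithmetic check, sketched below in Python. The function names are hypothetical, and the size ranges are the approximate values given above (about 1 to 10 residues for insertions and about 1 to 30 residues for deletions); the check considers only the net change in length and says nothing about mRNA secondary structure.

```python
# Illustrative sketch only; function names are hypothetical.
TYPICAL_RANGES = {
    "insertion": (1, 10),  # approximate range stated in the text, in residues
    "deletion": (1, 30),
}


def preserves_reading_frame(deleted_nt: int, inserted_nt: int) -> bool:
    """True when the net change in nucleotide count is a multiple of three."""
    if deleted_nt < 0 or inserted_nt < 0:
        raise ValueError("nucleotide counts must be non-negative")
    return (inserted_nt - deleted_nt) % 3 == 0


def within_typical_size(kind: str, residues: int) -> bool:
    """True when an insertion or deletion falls in the typical size range."""
    low, high = TYPICAL_RANGES[kind]
    return low <= residues <= high
```

For instance, deleting two residues removes six nucleotides and keeps the downstream sequence in frame, whereas deleting a single nucleotide does not.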

[0152] Substitutional variants are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with the following Table 1 when it is desired to finely modulate the characteristics of the protein. Table 1 shows amino acids which may be substituted for an original amino acid in a protein and which are typically regarded as conservative substitutions.

TABLE 1

Original Residue    Conservative Substitutions
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu
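For screening proposed substitutions, Table 1 can be encoded as a lookup, as in the following Python sketch. The dictionary reproduces the entries of Table 1 exactly as given (the table is directional and has no entry for proline); the function name is hypothetical.

```python
# Illustrative sketch only; the mapping mirrors Table 1 as given in the text.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True when Table 1 lists `replacement` for the `original` residue.

    Residues are three-letter codes; case is ignored.
    """
    listed = CONSERVATIVE.get(original.capitalize(), set())
    return replacement.capitalize() in listed
```

A substitution for which `is_conservative` returns False is not necessarily disruptive; it is simply not among those listed in Table 1.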

[0153] Substantial changes in transcription factor function or other features are made by selecting substitutions that are less conservative than those in Table 1, i.e., selecting residues that differ more significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.

[0154] Homologous polypeptides that share at least 50% amino acid sequence identity to the disclosed PTD, PTLF, PTAG-1 or PTAG-2 amino acid sequences as determined using BLAST 2.0, gapped blastp, with default parameters, are encompassed by this invention. Homologs with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity. Such homologous peptides are preferably at least 10 amino acids in length, and more preferably at least 25 or 50 amino acids in length. When less than the entire sequence is being compared for sequence identity, homologs will typically possess at least 75% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least 90% or 95% depending on their similarity to the reference sequence. Also encompassed by the present invention are the nucleic acid sequences that encode these homologous peptides.
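The notion of sequence identity over short windows can be illustrated with the following Python sketch. It is a deliberately simplified, ungapped comparison of two pre-aligned sequences of equal length and is not a substitute for the BLAST 2.0 gapped blastp analysis specified above; the function names and the toy sequences in the usage note are hypothetical.

```python
# Illustrative sketch only; an ungapped comparison, not a BLAST alignment.
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def window_identities(a: str, b: str, window: int = 10) -> List[float]:
    """Percent identity for every window of the given size."""
    if len(a) != len(b):
        raise ValueError("sequences must be of equal length")
    if not 0 < window <= len(a):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    ]
```

For two toy 10-residue peptides differing at a single position, `percent_identity` returns 90.0, which would exceed the 75% figure given above for short windows.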

[0155] Similarly, homologous nucleic acids that share at least 50% nucleotide identity to the disclosed PTD, PTLF, PTAG-1 or PTAG-2 nucleic acid sequences as determined using BLAST 2.0, gapped blastn, with default parameters, are encompassed by this invention. Homologs with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity. Such homologous nucleic acids are preferably at least 50 nucleotides in length, and more preferably at least 100 or 250 nucleotides in length.

Example 6

Other Applications of the Disclosed Sequences

[0156] The disclosed floral homeotic nucleic acids and polypeptides are useful as laboratory reagents to study and analyze floral gene expression in plants, including plants engineered for modified fertility characteristics. For example, probes and primers derived from the PTD sequence, as well as monoclonal antibodies specific for the PTD polypeptide, may be used to detect and quantify expression of PTD in seedlings transformed with an antisense PTD construct as described above. Such analyses would facilitate detection of those transformants that display modified PTD expression and which may therefore be good candidates for having modified fertility characteristics.

[0157] The production of probes and primers derived from the disclosed sequences is described in detail above. Production of monoclonal antibodies requires that all or part of the protein against which the antibodies are to be raised be purified. With the provision herein of the floral homeotic nucleic acid sequences, as well as the sequences of the encoded polypeptides, this may be achieved by expression in heterologous expression systems, or by chemical synthesis of peptide fragments.

[0158] Many different expression systems are available for expressing cloned nucleic acid molecules. Examples of prokaryotic and eukaryotic expression systems that are routinely used in laboratories are described in Chapters 16-17 of Sambrook et al. (1989). Such systems may be used to express the floral homeotic polypeptides at high levels to facilitate purification.

[0159] By way of example only, high level expression of a floral homeotic polypeptide may be achieved by cloning and expressing the selected cDNA in yeast cells using the pYES2 yeast expression vector (Invitrogen, San Diego, Calif.). Secretion of the recombinant floral homeotic polypeptide from the yeast cells may be achieved by placing a yeast signal sequence adjacent to the floral homeotic nucleic acid coding region. A number of yeast signal sequences have been characterized, including the signal sequence for yeast invertase. This sequence has been successfully used to direct the secretion of heterologous proteins from yeast cells, including such proteins as human interferon (Chang et al., 1986), human lactoferrin (Liang and Richardson, 1993) and prochymosin (Smith et al., 1985). Alternatively, the polypeptide may be expressed at high level in prokaryotic expression systems, such as E. coli.

[0160] Monoclonal or polyclonal antibodies may be produced to the selected floral homeotic polypeptide or portions thereof. Optimally, antibodies raised against a specified floral homeotic polypeptide will specifically detect that polypeptide. That is, for example, antibodies raised against the PTD polypeptide would recognize and bind the PTD polypeptide and would not substantially recognize or bind to other proteins found in poplar cells. The determination that an antibody specifically detects PTD is made by any one of a number of standard immunoassay methods; for instance, the Western blotting technique (Sambrook et al., 1989). To determine that a given antibody preparation (such as one produced in a mouse against PTD) specifically detects PTD by Western blotting, total cellular protein is extracted from poplar cells and electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. The proteins are then transferred to a membrane (for example, nitrocellulose) by Western blotting, and the antibody preparation is incubated with the membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline phosphatase; application of the substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the production of a dense blue compound by immuno-localized alkaline phosphatase. Antibodies which specifically detect PTD will, by this technique, be shown to bind to substantially only the PTD band (which will be localized at a given position on the gel determined by its molecular weight). Non-specific binding of the antibody to other proteins may occur and may be detectable as a weak signal on the Western blot. The non-specific nature of this binding will be recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary signal arising from the specific antibody-PTD binding.

[0161] Substantially pure floral homeotic polypeptides suitable for use as an immunogen may be isolated from transformed cells as described above. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per milliliter. Alternatively, peptide fragments of the specified floral homeotic polypeptide may be utilized as immunogens. Such fragments may be chemically synthesized using standard methods, or may be obtained by cleavage of the whole floral homeotic polypeptide followed by purification of the desired peptide fragments. Peptides as short as 3 or 4 amino acids in length are immunogenic when presented to the immune system in the context of a Major Histocompatibility Complex (MHC) molecule, such as MHC class I or MHC class II. Accordingly, peptides comprising at least 3 and preferably at least 4, 5, 6 or 10 or more consecutive amino acids of the disclosed floral homeotic polypeptide amino acid sequences may be employed as immunogens to raise antibodies. Because naturally occurring epitopes on proteins are frequently comprised of amino acid residues that are not adjacently arranged in the peptide when the peptide sequence is viewed as a linear molecule, it may be advantageous to utilize longer peptide fragments from the floral homeotic polypeptide amino acid sequences in order to raise antibodies. Thus, for example, peptides that comprise at least 10, 15, 20, 25 or 30 consecutive amino acid residues of the floral homeotic polypeptide amino acid sequence may be employed. Monoclonal or polyclonal antibodies to the intact floral homeotic polypeptide or peptide fragments of this protein may be prepared as described below.

[0162] Monoclonal antibody to epitopes of the selected floral homeotic polypeptide can be prepared from murine hybridomas according to the classical method of Kohler and Milstein (1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (1988).

[0163] Having illustrated and described the principles of isolating the Populus floral homeotic genes, the proteins encoded by these genes and modes of use of these biological molecules, it should be apparent to one skilled in the art that the invention can be modified in arrangement and detail without departing from such principles. We claim all modifications coming within the spirit and scope of the claims presented herein.

References

[0164] Ainley et al. (1993). Regulatable endogenous production of cytokinins up to “toxic” levels in transgenic plants and plant tissues. Plant Mol. Biol. 22:13-23.

[0165] Altschul et al. (1990). J. Mol. Biol. 215:403-410.

[0166] Altschul et al. (1994). Nature Genetics 6:119-129.

[0167] An et al. (1988). Plant Physiol. 88: 547.

[0168] Ausubel et al. (1987). In: Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Interscience.

[0169] Baker et al. (1990). RNA and DNA isolation from recalcitrant plant tissues. BioTechniques 9:268-272.

[0170] Benfey and Chua (1990). The cauliflower mosaic virus 35S promoter: Combinatorial regulation of transcription in plants. Science 250:959-966.

[0171] Boes and Strauss (1994). Floral phenology and morphology of black cottonwood, Populus trichocarpa (Salicaceae). Am. J. Bot. 8:562-567.

[0172] Brayshaw (1965). The status of the black cottonwood (Populus trichocarpa Torrey and Gray). Can. Field Nat. 79:91-95.

[0173] Brusslan et al. (1993). An Arabidopsis mutant with a reduced level of cab140 RNA is a result of cosuppression. Plant Cell 5:667-677.

[0174] Bustos et al. (1989). Plant Cell 1: 839.

[0175] Callis et al. (1988). Plant Physiol. 88: 965.

[0176] Carpenter et al. (1992). Preferential expression of an α-tubulin gene of Arabidopsis in pollen. Plant Cell 4:557-571.

[0177] Chang et al. (1986). Saccharomyces cerevisiae secretes and correctly processes human interferon hybrid protein containing yeast invertase signal peptides. Mol. and Cell. Biol. 6:1812-1819.

[0178] Cheng et al. (1983). Organ initiation and the development of unisexual flowers in the tassel and ear of Zea mays. Am. J. Bot. 70:450-462.

[0179] Coen et al. (1990). FLORICAULA: a homeotic gene required for flower development in Antirrhinum majus. Cell 63:1311-1322.

[0180] Corpet et al. (1988). Nucleic Acids Research 16:10881-10890.

[0181] Dekeyser et al. (1990). Plant Cell 2:591.

[0182] Denis et al. (1993). Expression of engineered nuclear male sterility in Brassica napus. Plant Physiol. 101:1295-1304.

[0183] Don et al. (1991). “Touchdown” PCR to circumvent spurious priming during gene amplification. Nucl. Acids Res. 19:4008.

[0184] Ellis et al. (1991). Plant. Mol. Biol. 17:19-27.

[0185] Ellis et al. (1993). Bio/Technology 11:84-89.

[0186] Engvall (1980). Meth. Enzymol. 70:419.

[0187] Flavell (1994). Inactivation of gene expression in plants as a consequence of specific sequence duplication. Proc Natl Acad Sci USA 91:3490-3496.

[0188] Fromm et al. (1989). Plant Cell 1:977.

[0189] Gan and Amasino (1995). Inhibition of leaf senescence by autoregulated production of cytokinin. Science 270:1986-1988.

[0190] Gatz (1997). Chemical control of gene expression. Ann. Rev. Plant Physiol. Plant Mol. Biol. 48:89-108.

[0191] Gelvin et al. (1990). Plant Molecular Biology Manual, Kluwer Academic Publishers.

[0192] Gilmartin et al. (1992). Characterization of a gene encoding a DNA binding protein with specificity for a light-responsive element. Plant Cell 4:839-949.

[0193] Goldman et al. (1994). Female sterile tobacco plants are produced by stigma-specific cell ablation. EMBO J. 13:2976-2984.

[0194] Grant et al. (1994). Developmental differences between male and female flowers in the dioecious plant Silene latifolia. Plant J. 6:471-480.

[0195] Han et al. (1996). Cellular and molecular biology of Agrobacterium-mediated transformation of plants and its application to genetic transformation of Populus. In: Stettler et al. [eds.] Biology of Populus and its Implications for Management and Conservation, Part I, Chapter 9, pp. 201-222, NRC Research Press, Nat. Res. Coun. of Canada, Ottawa, Ontario.

[0196] Harlow and Lane (1988). Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York.

[0197] Henikoff (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359.

[0198] Higgins and Sharp (1988). Gene 73: 237-244.

[0199] Higgins and Sharp (1989). CABIOS 5:151-153.

[0200] Huang et al. (1992). Computer Applications in the Biosciences 8:155-165.

[0201] Innis et al. (1990). PCR Protocols, A Guide to Methods and Applications, Innis et al. [eds.], Academic Press, Inc., San Diego, Calif.

[0202] Jefferson et al. (1987). GUS fusions: β-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. EMBO J. 6:3901-3907.

[0203] Jorgensen (1992). Silencing of plant genes by homologous transgenes. AgBiotech News Info 4:265N-273N.

[0204] Kandasamy et al. (1993). Ablation of papillar cell function in Brassica flowers results in the loss of stigma receptivity to pollination. Plant Cell 5:263-275.

[0205] Kaul (1995). Reproductive structure and organogenesis in a cottonwood, Populus deltoides (Salicaceae). Int. J. Plant Sci. 156:172-180.

[0206] Kawasaki et al. (1990). In: PCR Protocols, A Guide to Methods and Applications, Innis et al. [eds.], pp. 21-27, Academic Press, Inc., San Diego, Calif.

[0207] Kelly et al. (1995). NFL, the tobacco homolog of FLORICAULA and LEAFY, is transcriptionally expressed in both vegetative and floral meristems. Plant Cell 7:225-34.

[0208] Kohler and Milstein (1975). Nature 256:495.

[0209] Kooter and Mol (1993). Trans-inactivation of gene expression in plants. Curr. Opin. Biotechnol. 4:166-171.

[0210] Kuhlemeier et al. (1989). Plant Cell 1:471.

[0211] Leple et al. (1992). Transgenic poplars: Expression of chimeric genes using four different constructs. Plant Cell Rep. 11:137-41.

[0212] Liang and Richardson (1993). Expression and characterization of human lactoferrin in yeast (Saccharomyces cerevisiae). J. Agric. Food Chem. 41:1800-1807.

[0213] Ma (1994). The unfolding drama of flower development: Recent results from genetic and molecular analyses. Genes Dev. 8:745-756.

[0214] Ma et al. (1991). AGL1-AGL6, an Arabidopsis gene family with similarity to floral homeotic and transcription factor genes. Genes Dev. 5:484-495.

[0215] Mandel et al. (1992a). Manipulation of flower structure in transgenic tobacco. Cell 71:133-143.

[0216] Mandel et al. (1992b). Molecular characterization of the Arabidopsis floral homeotic gene APETALA1. Nature 360:273-277.

[0217] Marcotte et al. (1989). Plant Cell 1:969.

[0218] Mariani et al. (1990). Induction of male sterility in plants by a chimaeric ribonuclease gene. Nature 347:737-741.

[0219] Mariani et al. (1992). A chimaeric ribonuclease-inhibitor gene restores fertility to male-sterile plants. Nature 357:384-387.

[0220] Matzke et al. (1993). Genomic imprinting in plants: parental effects and trans-inactivation phenomena. Annu. Rev. Plant Physiol. Plant Mol. Biol. 44:53-76.

[0221] McCown et al. (1991). Stable transformation of Populus and incorporation of pest resistance by electric discharge particle acceleration. Plant Cell Rep. 9:590-594.

[0222] Mizukami et al. (1996). Plant Cell 8:831-845.

[0223] Mol et al. (1994). Post-transcriptional inhibition of gene expression: Sense and antisense genes. In: Paszkowski J (ed.) Homologous Recombination and Gene Silencing in Plants, pp. 309-334, Kluwer Academic Publishers, Dordrecht.

[0224] Nagaraj (1952). Floral morphology of Populus deltoides and P. tremuloides. Bot. Gaz. 114:222-243.

[0225] Needleman and Wunsch (1970). J. Mol. Biol. 48:443.

[0226] Odell et al. (1994). Seed specific gene activation mediated by the Cre/lox site-specific recombination system. Plant Physiol. 106:447-458.

[0227] Okamuro et al. (1993). Regulation of Arabidopsis flower development. Plant Cell 5:1183-93.

[0228] Opperman et al. (1993). Root knot nematode directed expression of a plant root specific gene. Science 263:221-223.

[0229] Pappenheimer (1977). Diphtheria toxin. Annu. Rev. Biochem. 46:69-94.

[0230] Paul et al. (1992). The isolation and characterization of the tapetum-specific Arabidopsis thaliana A9 gene. Plant Mol. Biol. 19:611-622.

[0231] Pearson and Lipman (1988). Proc. Natl. Acad. Sci. USA 85:2444.

[0232] Pearson et al. (1994). Methods in Molecular Biology 24:307-331.

[0233] Pnueli et al. (1991). Plant J. 1:255-266.

[0234] Pnueli et al. (1994). Isolation of the tomato Agamous gene TAG1 and analysis of its homeotic role in transgenic plants. Plant Cell 6:163-173.

[0235] Pouwels et al. (1987). Cloning Vectors: A Laboratory Manual, 1985 supplement.

[0236] Purugganan et al. (1995). Molecular evolution of flower development: Diversification of the plant MADS-box regulatory gene family. Genetics 140:345-56.

[0237] Reynaerts et al. (1993). Engineered genes for fertility control and their application in hybrid seed production. Sci. Hortic 55:125-139.

[0238] Riechmann et al. (1996). DNA binding properties of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA and AGAMOUS. Nucl. Acids Res. 24(16):3134-3141.

[0239] Roshal et al. (1987). EMBO J. 6:1155.

[0240] Sambrook et al. (1989). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

[0241] Schaffner and Sheen (1991). Plant Cell 3:997.

[0242] Schernthaner et al. (1988). EMBO J. 7:1249.

[0243] Schmülling et al. (1993). Restoration of fertility by antisense RNA in genetically engineered male sterile tobacco plants. Mol. Gen. Genet. 237:385-394.

[0244] Schwarz-Sommer et al. (1992). EMBO J. 11: 251-263.

[0245] Sheppard (1997). PTD: a Populus trichocarpa gene with homology to floral homeotic transcription factors. Ph.D. dissertation. Oregon State University.

[0246] Siebertz et al. (1989). Plant Cell 1:961.

[0247] Smith and Waterman (1981). Adv. Appl. Math. 2:482.

[0248] Smith et al. (1985). Heterologous protein secretion from yeast. Science 229:1219-1224.

[0249] Stettler (1993). Poplar Molecular Network Newsletter 1(1), College of Forest Resources AR-10, University of Washington, Seattle, Wash.

[0250] Stockhaus et al. (1997). The promoter of the gene encoding the C₄ form of phosphoenolpyruvate carboxylase directs mesophyll-specific expression in transgenic C₄ Flaveria spp. Plant Cell 9:479-489.

[0251] Strauss et al. (1995a). Molecular Breeding 1:5-26.

[0252] Strauss et al. (1995b). TGERC Annual Report: 1994-1995. Forest Research Laboratory, Oregon State University.

[0253] Taylor et al. (1992). Conditional male-fertility in chalcone synthase-deficient petunia. J. Hered. 83:11-17.

[0254] Terada and Shimamoto (1990). Mol. Gen. Genet. 220:389.

[0255] Thorsness et al. (1991). A Brassica S-locus gene promoter targets toxic gene expression and cell death to the pistil and pollen of transgenic Nicotiana. Devel. Biol. 143:173-184.

[0256] Thorsness et al. (1993). Genetic ablation of floral cells in Arabidopsis. Plant Cell 5:253-61.

[0257] Tijssen (1993). Overview of principles of hybridization and the strategy of nucleic acid probe assays. In: Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I, Chapter 2. Elsevier, N.Y.

[0258] Van der Meer et al. (1992). Antisense inhibition of flavonoid biosynthesis in petunia anthers results in male sterility. Plant Cell 4:253-262.

[0259] Wagner et al. (1987). Chloroplast DNA polymorphisms in lodgepole pine and their hybrids. Proc. Natl. Acad. Sci. USA 84:2097-2100.

[0260] Weigel et al. (1992). LEAFY controls floral meristem identity in Arabidopsis. Cell 69:843-59.

[0261] Weigel and Nilsson (1995). A developmental switch sufficient for flower initiation in diverse plants. Nature 377: 495-500.

[0262] Weissbach and Weissbach (1989). Methods for Plant Molecular Biology, Academic Press.

[0263] Worrall et al. (1992). Premature dissolution of the microsporocyte callose wall causes male sterility in transgenic tobacco. Plant Cell 4:759-771.

[0264] Yanofsky (1995). Floral meristems to floral organs: Genes controlling early events in Arabidopsis flower development. Annu. Rev. Plant Physiol. 46:167-188.

[0265] Yanofsky et al. (1990). The protein encoded by the Arabidopsis homeotic gene AGAMOUS resembles transcription factors. Nature 346: 35-39.

1 24 1 4285 DNA Populus balsamifera subsp. trichocarpa 1 aagcttgtca gacccaacaa atatggacct gatatgcttg tcataccaac ttaaactcga 60 gtcagatata tttaatatta ttattcatat tattaatata attattaata aatttgaaaa 120 aatattatta ctatcgaaaa aaactataaa tttgatttga atgataaaat taaaaattaa 180 aaatattttt attatcatta ttgttaatat aattattaaa aaatttaaaa aatatattat 240 tactattgag aaaaaccata actgtttatg cgacatttta tgtcatggaa aatgagctga 300 aaaaaaccaa taaaaagaaa aaaactaatg aaaaaaaaga aaaaaaaata tgaattaact 360 gggttaaccc ttgaaaccag gttaccccgt caaaccttgg attcgtgtcg tgaaagtttg 420 ttaactaaat agaaaaaaaa aattgacggg ttacccagaa ttaactgggc taacccgtca 480 aaccaggtta cctatcaaac ccgggatccg tgtcatgaaa gtttgataac taaatagaaa 540 acaattgaac attaacaacc taaattaaac gaaaaaaatt aattaaaaac aagaaaacaa 600 aaacaaacaa aaaacataag catgttagta atgaggaaaa agaaaaaaaa tttgattcaa 660 ctgagttaac ccgtcaaacc cgggattcgc gtcatgaaag tttgataact aaatagaaaa 720 aaaaatcgac gggttaaacg aaaaaaaatt aacaaactaa actaaacaaa aaaaaattga 780 ttaaaaagga aaaaagcaaa aaaaataatt tcgggttaac tcatcaaacc aggttaaccc 840 gtcaaacccg agatccgtgt catgaaagtc tgataactaa ataatttttt ttttcacatt 900 aacaaactaa attaaacaaa aaaaattcat taaaagaaaa aaaaacacaa agaaaaaagc 960 aaaaaaaaac ctataatagc ataaataaat aaataaaaac aggaaaaaaa ttttttaaaa 1020 aaaacctttc aatcactaat acatagaagg tgtggggaaa gccacagtga tttccccgta 1080 ccttttaaag tattacttaa tatataggtg aatttaattg accgtcacga aaaagactat 1140 tctggcttcc tcttacaatg gacgctatct aaattcaaat actttgaaaa aagatttaat 1200 cctgtaacct tctttcgttt ttttatgcct tcaatccatc tatttattgt ttttatgatt 1260 tttcttagat acaaaagagc atattttaaa gaagaaaaaa ataagctaag cacctcaagt 1320 tttgattttt tttttatttt gcagccaatt ttttaaatat taaaattttc ataatagatc 1380 aaaggataat tcaaaattgc atccaaataa caacattagt aatggaagga cttatggtat 1440 gaatggatca ataatataag ggctgaatta acaacatttt ttttatttag atcctgttta 1500 tttttacgtt ttaaaaatat ttttgaaatt attttatttt ttattataaa ttaatatttt 1560 tagatcattt taatacgtta atataaaaaa taattttttt aaaaaaattt attttaatat 1620 attttttaaa aataatattt aaaaaaacaa 
tcataacaat attctcatta cctaacacag 1680 tcatggaaca ggaatgagaa aaggtcttat cagtaaattg cttgcatgtc atgtcaaggt 1740 gtatgaacct cccaatactt ctcacgctac ccttcagaaa tccaatctca gaagccacag 1800 acaatctaag ttacgctaca atcaactttc catcaccctt tccttattta gaaactccac 1860 ttaatcacat ttcacccttt ttcatcatct tctctttccc ttcaagaagc ctaggtactg 1920 tgcaagaaac ccttatctct ccccctcagt atttactttt gtttagtgct acagctttca 1980 caaagaagta aggaaaaaat atgggtcgtg gaaagattga aatcaagaag atcgaaaacc 2040 ccacaaacag gcaagtcacc tactcgaaga gaagaaatgg tattttcaag aaagcccaag 2100 aactcactgt actttgtgat gctaaggtct ctcttatcat gttctccaac actaacaaac 2160 tcaatgagta cattagcccc tccacatcgt acgtatactc gtatcatgtt tctggctaag 2220 tatttcttcc gtgctttctc ttctttcttt cttttcttgt cttttatgtt gcagttttat 2280 gaaaccttgg taatggaacc gtagttttta ttgttaatta tgaccaggac aaagaagatc 2340 tacgatcaat atcagaacgc tttaggcata gatctgtggg gcactcaata cgaggttaac 2400 ctttcttttc tgtctttctt ctaatgtttg atctatagga cgaatatgag attcttcaaa 2460 ggattttgtt tgtgaggttt gcagaaaatg caagagcact tgaggaagct gaatgatatc 2520 aatcataagc tgagacaaga aatcaggtaa cttcaaaaga aataaccttc gcatatatgc 2580 atgtggttat ggtttttatg ggaatatctg taaatttgtg gagctactaa ttaaggtatt 2640 tgtttttaac aggcagagga gaggagaggg cctgaatgat ctgagcattg atcatctgcg 2700 cggtcttgag caacatatga ctgaagcctt gaatggtgtg cgtggcagga aggtcagatg 2760 ttttcaagtg aacatcttta tataattatc aagttctaat tcctaaaatt tgagcttact 2820 agtaatttga gttcggtccg gtgtatcaag caggttaatc tagatctagt tttttttcct 2880 taccaaatca aagtcatttt gaggattttt taataaaaaa tattgaattt tgaatcaact 2940 tatacaaatt catcaatcta caactcgaat cttacattta atcaaacttt caaattagat 3000 cttataaata tgatattaac cggtcggtgt tttatgtaca ttaatattat gttttagttg 3060 aactctttta tcattttttt tttttaaatt tgagttattt taatccttat caatttttat 3120 cattttggga ttcttggaaa ccctggttag aaagaaaata cacacccttg aacttgtgct 3180 tctttacctt tgcattatgg attttcatga actggatttt gggtaaccct taacctcatc 3240 tatagaaggg atatgccttg taattaacac tttacactta caagttcaac attctttgat 3300 tatttacagt accatgtgat caaaacacaa aacgaaacct 
acaggaagaa ggttagtgat 3360 aaaaagaaca ttttacctct tcaatttcat gcatgtagct tttggaacaa attctctggc 3420 gattaattgc aggtgaagaa tttagaggag agacatggaa acctcttgat ggaatatgta 3480 agaatctaaa ttttcatgtg cttgttttcg ctaattttcc aacttggaaa aacacatgga 3540 ttaaacctga gatttttttt ttcttttgtg ctttgggatt taaggaagca aaactagagg 3600 atcgacagta tggtttagtg gacaatgaag ctgctgttgc acttgcaaat ggggcttcca 3660 acctctatgc attccgcctg catcacgggc acaaccacca ccaccatctc cctaatcttc 3720 accttggaga tggatttgga gcccatgaac ttcgccttcc ttgagtggtg cttgaggtcg 3780 accttccagc tcttcagaca tcttatctaa atgcgtgtgc taactagaga tgctatctaa 3840 tattatttaa taattaatta agagcccgga agtaaaaaat actttcatag attgtaattt 3900 acctcagggt aatgtgtatg gcagcatatt agattgtgat ttgagcaagg aatgtcattc 3960 cttatggatt aattaaatat aaaagctctt tttcacaaat ataattccac ttggagtagc 4020 attctgcaat atcccatatg atctgcaggc ttaataatta tatgattgaa atgtgttgga 4080 tcaaccgtca tatgtatgta tgtatgtatg tatgtatacg tatgtgtata ctagggagtc 4140 aacaacacag ggggtgtaag caccaaatgc attatccact gtttttgccc aaaccccatt 4200 tggcataggt cgacaatacc ataccaatgc ctccgaagcc atccttcccc gccgccctac 4260 acaaaccaaa accgctgaat tcctg 4285 2 946 DNA Populus balsamifera subsp. 
trichocarpa CDS (1)..(684) 2 atg ggt cgt gga aag att gaa atc aag aag atc gaa aac ccc aca aac 48 Met Gly Arg Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Pro Thr Asn 1 5 10 15 agg caa gtc acc tac tcg aag aga aga aat ggt att ttc aag aaa gcc 96 Arg Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Ile Phe Lys Lys Ala 20 25 30 caa gaa ctc act gta ctt tgt gat gct aag gtc tct ctt atc atg ttc 144 Gln Glu Leu Thr Val Leu Cys Asp Ala Lys Val Ser Leu Ile Met Phe 35 40 45 tcc aac act aac aaa ctc aat gag tac att agc ccc tcc aca tcg aca 192 Ser Asn Thr Asn Lys Leu Asn Glu Tyr Ile Ser Pro Ser Thr Ser Thr 50 55 60 aag aag atc tac gat caa tat cag aac gct tta ggc ata gat ctg tgg 240 Lys Lys Ile Tyr Asp Gln Tyr Gln Asn Ala Leu Gly Ile Asp Leu Trp 65 70 75 80 ggc act caa tac gag aaa atg caa gag cac ttg agg aag ctg aat gat 288 Gly Thr Gln Tyr Glu Lys Met Gln Glu His Leu Arg Lys Leu Asn Asp 85 90 95 atc aat cat aag ctg aga caa gaa atc agg cag agg aga gga gag ggc 336 Ile Asn His Lys Leu Arg Gln Glu Ile Arg Gln Arg Arg Gly Glu Gly 100 105 110 ctg aat gat ctg agc att gat cat ctg cgc ggt ctt gag caa cat atg 384 Leu Asn Asp Leu Ser Ile Asp His Leu Arg Gly Leu Glu Gln His Met 115 120 125 act gaa gcc ttg aat ggt gtg cgt ggc agg aag tac cat gtg atc aaa 432 Thr Glu Ala Leu Asn Gly Val Arg Gly Arg Lys Tyr His Val Ile Lys 130 135 140 aca caa aac gaa acc tac agg aag aag gtg aag aat tta gag gag aga 480 Thr Gln Asn Glu Thr Tyr Arg Lys Lys Val Lys Asn Leu Glu Glu Arg 145 150 155 160 cat gga aac ctc ttg atg gaa tat gaa gca aaa cta gag gat cga cag 528 His Gly Asn Leu Leu Met Glu Tyr Glu Ala Lys Leu Glu Asp Arg Gln 165 170 175 tat ggt tta gtg gac aat gaa gct gct gtt gca ctt gca aat ggg gct 576 Tyr Gly Leu Val Asp Asn Glu Ala Ala Val Ala Leu Ala Asn Gly Ala 180 185 190 tcc aac ctc tat gca ttc cgc ctg cat cac ggg cac aac cac cac cac 624 Ser Asn Leu Tyr Ala Phe Arg Leu His His Gly His Asn His His His 195 200 205 cat ctc cct aat ctt cac ctt gga gat gga ttt gga gcc cat gaa ctt 672 His Leu Pro Asn Leu His Leu Gly Asp Gly 
Phe Gly Ala His Glu Leu 210 215 220 cgc ctt cct tga gtggtgcttg aggtcgacct tccagctctt cagacatctt 724 Arg Leu Pro 225 atctaaatgc gtgtgctaac tagagatgct atctaatatt atttaataat taattaagag 784 cccggaagta aaaaatactt ccatagattg taatttacct cagggtaatg tgtatggcag 844 catattagat tgtgatttga gcaaggaatg tcattcctta tggattaatt aaatataaaa 904 gctctttttc acaaataaaa aaaaaaaaaa aaaaaaaaaa aa 946 3 681 DNA Populus balsamifera subsp. trichocarpa CDS (1)..(681) 3 atg ggt cgt gga aag att gaa atc aag aag atc gaa aac ccc aca aac 48 Met Gly Arg Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Pro Thr Asn 1 5 10 15 agg caa gtc acc tac tcg aag aga aga aat ggt att ttc aag aaa gcc 96 Arg Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Ile Phe Lys Lys Ala 20 25 30 caa gaa ctc act gta ctt tgt gat gct aag gtc tct ctt atc atg ttc 144 Gln Glu Leu Thr Val Leu Cys Asp Ala Lys Val Ser Leu Ile Met Phe 35 40 45 tcc aac act aac aaa ctc aat gag tac att agc ccc tcc aca tcg aca 192 Ser Asn Thr Asn Lys Leu Asn Glu Tyr Ile Ser Pro Ser Thr Ser Thr 50 55 60 aag aag atc tac gat caa tat cag aac gct tta ggc ata gat ctg tgg 240 Lys Lys Ile Tyr Asp Gln Tyr Gln Asn Ala Leu Gly Ile Asp Leu Trp 65 70 75 80 ggc act caa tac gag aaa atg caa gag cac ttg agg aag ctg aat gat 288 Gly Thr Gln Tyr Glu Lys Met Gln Glu His Leu Arg Lys Leu Asn Asp 85 90 95 atc aat cat aag ctg aga caa gaa atc agg cag agg aga gga gag ggc 336 Ile Asn His Lys Leu Arg Gln Glu Ile Arg Gln Arg Arg Gly Glu Gly 100 105 110 ctg aat gat ctg agc att gat cat ctg cgc ggt ctt gag caa cat atg 384 Leu Asn Asp Leu Ser Ile Asp His Leu Arg Gly Leu Glu Gln His Met 115 120 125 act gaa gcc ttg aat ggt gtg cgt ggc agg aag tac cat gtg atc aaa 432 Thr Glu Ala Leu Asn Gly Val Arg Gly Arg Lys Tyr His Val Ile Lys 130 135 140 aca caa aac gaa acc tac agg aag aag gtg aag aat tta gag gag aga 480 Thr Gln Asn Glu Thr Tyr Arg Lys Lys Val Lys Asn Leu Glu Glu Arg 145 150 155 160 cat gga aac ctc ttg atg gaa tat gaa gca aaa cta gag gat cga cag 528 His Gly Asn Leu Leu Met Glu Tyr Glu Ala Lys Leu Glu Asp Arg 
Gln 165 170 175 tat ggt tta gtg gac aat gaa gct gct gtt gca ctt gca aat ggg gct 576 Tyr Gly Leu Val Asp Asn Glu Ala Ala Val Ala Leu Ala Asn Gly Ala 180 185 190 tcc aac ctc tat gca ttc cgc ctg cat cac ggg cac aac cac cac cac 624 Ser Asn Leu Tyr Ala Phe Arg Leu His His Gly His Asn His His His 195 200 205 cat ctc cct aat ctt cac ctt gga gat gga ttt gga gcc cat gaa ctt 672 His Leu Pro Asn Leu His Leu Gly Asp Gly Phe Gly Ala His Glu Leu 210 215 220 cgc ctt cct 681 Arg Leu Pro 225 4 227 PRT Populus balsamifera subsp. trichocarpa 4 Met Gly Arg Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Pro Thr Asn 1 5 10 15 Arg Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Ile Phe Lys Lys Ala 20 25 30 Gln Glu Leu Thr Val Leu Cys Asp Ala Lys Val Ser Leu Ile Met Phe 35 40 45 Ser Asn Thr Asn Lys Leu Asn Glu Tyr Ile Ser Pro Ser Thr Ser Thr 50 55 60 Lys Lys Ile Tyr Asp Gln Tyr Gln Asn Ala Leu Gly Ile Asp Leu Trp 65 70 75 80 Gly Thr Gln Tyr Glu Lys Met Gln Glu His Leu Arg Lys Leu Asn Asp 85 90 95 Ile Asn His Lys Leu Arg Gln Glu Ile Arg Gln Arg Arg Gly Glu Gly 100 105 110 Leu Asn Asp Leu Ser Ile Asp His Leu Arg Gly Leu Glu Gln His Met 115 120 125 Thr Glu Ala Leu Asn Gly Val Arg Gly Arg Lys Tyr His Val Ile Lys 130 135 140 Thr Gln Asn Glu Thr Tyr Arg Lys Lys Val Lys Asn Leu Glu Glu Arg 145 150 155 160 His Gly Asn Leu Leu Met Glu Tyr Glu Ala Lys Leu Glu Asp Arg Gln 165 170 175 Tyr Gly Leu Val Asp Asn Glu Ala Ala Val Ala Leu Ala Asn Gly Ala 180 185 190 Ser Asn Leu Tyr Ala Phe Arg Leu His His Gly His Asn His His His 195 200 205 His Leu Pro Asn Leu His Leu Gly Asp Gly Phe Gly Ala His Glu Leu 210 215 220 Arg Leu Pro 225 5 5656 DNA Populus balsamifera subsp. 
trichocarpa 5 agtatatata ctaaataaat atataaactt gtaaaaaata aaagaaaaat aatcattgca 60 tgcaaactaa acaaacatta aaattatact taaacaaaac taatctaaaa tgaagttttt 120 aaaaggtaat tatgacatag ccacgagcca ctcaataaac ctttataaga tttaaattga 180 tgctaaaata tatttttttt tattttttgc atcattaaag aaataactca aaagcatctt 240 ttatttttta aatattaatt tattagaaca atacttgata tctattgaaa taatactcaa 300 tatctatcta taaatcaaaa aacctaaact ctagattgta aaaaataata ataatagaag 360 agccacccat ccaaaacttc tatattattt gacttgaaag caaaaacatt aatacacata 420 attcatgaaa aatactcatg aaagtctata attcacaaaa gaattgatga atattcatat 480 atagttcact aataacattc attttcatca tataattaac gtattaattc aagtactaaa 540 atattttatg aactaaaaga aattattgat caaagaaaga ctcaataaca aatatttttt 600 tattaatcaa actcaaattc aaattcatga accctcaaat ccattatcaa atccataaac 660 ctatttgggg tttgagattt ttgttatcca agggttttat ggagacaatt tatcattccc 720 tttttattag tcttttttat tatatattaa tattttatat taaaatacta attacaaaat 780 tcaatatgat tttaatcttg gacctcatat ataattccgc tttaaaactc cgactcatat 840 tctaaaccca attccaacat ggactaaaca attaatccca atattagagg gaacaaatta 900 tttatttctt aacaacacga aaactaaagt atatcactct gcaaaatgta attacaagtc 960 cttcgtgttt aggctagttt gaagatgcct gtggttggag accagagaca tcaaattaat 1020 gtttttttat agtaacatgt gctcaagttg catgcatttt tcgtaccaac aaaatacatg 1080 taaaatcatc atccattaat caaattgcaa tgattcatag catatgcata acgcatgtgt 1140 ctgtgcatgt tttagctggt tcaattcttg cagattgtac tgctaaatgt acgtactagc 1200 acctcaaatc acagtgacct cccaaatatt gcacagacct ctttgtttac aaatttcaag 1260 catcctaatt aatctcccaa gtgacatctg gtggccatgt tgcggccctg acaagcagct 1320 gagaaattct ccaacattag agggattcaa tgttctgttc aatgtttgga tacattgatt 1380 ctgcattgca acgctaatca cggtctgttc tccggcaagg gggggaaaaa caatgatcag 1440 ggataaggca gcgaatgtct ggtgaaaaca agggtatttt catacttttc tcaggttcgt 1500 gtagtcagca atgaacgaaa cgaggcaaat ccaaccaagt agaaaaacct catgagtaac 1560 gagaaagtcg aggagacagt atctggcacc ctcagatgca tcataccttg cgatgagcca 1620 gaaactaaga tgattctagt gacgtctaaa tcatcaatcc cacggttaaa aggacaccat 1680 aacccaagcc 
actagaatat ctgcttacgc agcaaccaca ctgcaaagcc acgacgaaga 1740 actacaaaga tacggatata acatgatata aatatattaa tacttaattc ttcaaggtct 1800 tggattatga acttttttgt tcatatttat tttattatat tgaaaaactc gaaataaata 1860 agacgattat tataagaatt cttaaatcat gtttatcaaa ttttgtccta tctagagacc 1920 attaataatt gtgtgtggat taattcacca aaaacttaaa tgaaaagtaa ctttatctat 1980 ctagagatgg aaaaggaact caattaccct caataataaa attggatgga aatcatctag 2040 atggtggtcc agtagtaaga ttttgggact aaaaggtttg ttctctttgt ggtctcaggt 2100 tcgagccatg tggttgctta tatgatgacc actgaaaatt tacatggtcg ttaacttcag 2160 ggcccgtggg attagtcgag gtgcgtcaag ttagtctgga cacccatatt aatctaaaaa 2220 aaaaaattaa atggcaaaaa atattttgaa tgttgaagta aaaaaagtga aagggaggta 2280 gtaaaacaat atacgaccta acaggagagg agtccaatca agtagatcat gtgtcaagag 2340 atgagtggat agaagaactt caagtgaaga atgtatgcag ggaaccaaat gtgtgaatga 2400 cacaaagatc tgactagttc gatttcaact gtccagttcc gaagaaacat caaaaccctt 2460 taattctgtt agcttcccaa tacatacaaa aaagaaaaaa agacaaaaaa ctcgtcctgt 2520 taagggcagt tttggtatat aaataaaaca agaagctcac ttgtctttat atatctacca 2580 aatccaagac atgcacctgt gaaagatcac agagagagag acaagggggc agatagatat 2640 ggatccggag gctttcacgg cgagtttgtt caaatgggat acgagagcaa tggtgccaca 2700 tcctaaccgt ctgcttgaaa tggtgccccc gcctcagcag ccaccggctg cggcgtttgc 2760 tgtaaggcca agggagctat gtgggctaga ggagttgttt caagcttatg gtattaggta 2820 ctacacggca gcaaaaatag ctgaactcgg gttcacagtg aacacccttt tggacatgaa 2880 agacgaggag cttgatgaaa tgatgaatag tttgtctcag atctttaggt gggatcttct 2940 tgttggtgag aggtatggta ttaaagctgc tgttagagct gaaagaagaa ggcttgatga 3000 ggaggatcct aggcgtaggc aattgctctc tggtgataat aatacaaata ctcttgatgc 3060 tctctcccaa gaaggtttgg ttagcattga ttctaccttt tagtgtaatt aagctaagct 3120 catactatta ctagctatag gagtccatgg ccaatttgtt gtagttttgt agagtaaatt 3180 aattctatgt atacttggat aagataatta gcttattata agatgttact tgccagctta 3240 taatttccat atacaacaat cattttcatt cccttttcct tttcttatat atgaaattta 3300 gttcaagtat aagtgcttgt acaccaatgt atgtttactc tagtcatatc aattctactt 3360 tgcagggttg gtttcttgct 
aattaatcac catgctcaat attagagtag taattctctt 3420 aactaagtcc aggttagcta gcttttggtt tcttgttaat tgccgcacat acttagctta 3480 aattagttct caaggtaata gttagcttaa tagctttgag ctcatactgg tttctataaa 3540 ataaatgaac aaaatctgat tgtttcgaaa aattaaataa cattaactta ttaaacttat 3600 tttcctttcc ttaattttta atttttgctt gtttcttggg tggttgtgtg ttcaggtttc 3660 tctgaggagc cagtacagca agacaaggag gcagcaggga gcggtggaag agggacatgg 3720 gaggcagtgg cagcggggga gaggaagaaa cagtcagggc ggaagaaagg ccaaagaaag 3780 gtggtggacc ttgatggaga tgatgaacat ggtggtgcta tctgtgagag acagcgggag 3840 cacccattca ttgtaacaga gcctggtgaa gtggcacgtg gcaaaaagaa cggtcttgat 3900 tacctcttcc atttatatga acagtgtcgt gatttcttga tccaagtcca aagcattgcg 3960 aaggagaggg gagaaaaatg ccccactaag gtacgaagag tcagcttcgc gagggattga 4020 tttttattta gaaatatatt aaaataatat tttttatatt ttaaaattta tttttaatat 4080 taatatatta aaataatata aaaatactga aaaataattt tttaaaaaat aatttttttt 4140 caaaaatatt tacaaaacaa actgtgtcta aagaacacat ttagaccgtt aatttctgca 4200 agtctcaaca tttcaatggt tcttgtcttg gacccacata gaccagccat tgtattctgg 4260 actggactgg agttatgccc ccacctgaat ttgcctttca cagctgtccc gataaaaacg 4320 tgacaactca tgtactggtt tctggtccct gtcattttag acctgctatt tgcagtggga 4380 tacttattgg ttactcttac tagtcgatca tcgttatttg aatatttcaa atattctgat 4440 tttggaagtt tgtacgatgt cgtgtcacgt ggatcttgtg aaacctggtt gatgtcaact 4500 attgtcgaac tggaccaaaa tccattacat tctgagtttc tctagtgttt tcctgccatg 4560 gaacctgaaa gcccatgttg atggttagga cttagaattt gattagccct aaatggaaca 4620 gtgagtaatt atgctaagaa aaatggtttt tttttgtttt gttttgtgtt tggttatagg 4680 tgacaaatca ggtgtttagg tatgccaaga aggcaggagc aagctacatc aacaagccca 4740 aaatgagaca ctacgtgcat tgctatgctt tacattgcct cgatgaggac gcatccaatg 4800 cacttaggag agcgttcaag gagagaggag aaaatgttgg agcatggaga caggcttgtt 4860 acaagcccct tgtagccatc gcatcacgcc aaggctggga catagattcc attttcaatg 4920 ctcatcctcg gcttgccatt tggtatgtgc cgaccaagct ccgtcaactt tgttatgcag 4980 agcgcaatag tgccacttct tcaagctctg tctctggtac tggaggtcac ctgccgtttt 5040 gagttcttaa ttatgccaag ataaatactc 
ctatctctat aaaattgtca aaatgtatgt 5100 tgtagcgagg tcaggacaaa gtattggttg atggaggatg gttcattaaa tttcacatcc 5160 ttgactattt atatatcatg atatgcttaa aggctctaat cattgtttac gtcgatggaa 5220 ctattatatt tctaatttag ttttcaggga agtctaggct gctggtgcct acagtgtcca 5280 taaatttgag caaaatggcc aaaaggggcc aattgggacc cactaaatta atttggtggt 5340 gcagtccccc ttacaatacg actgcatgta atacttgtcc aaaatttgag tgcagttcat 5400 aggctgttac tttaaacaga caaacacatg atgacaagat aaaaggcatg gataattctt 5460 gtcttcttga ggtgccaaca tgcaaaatgc catgtcaggt tgttgatttg atttctaatt 5520 gttaaccatt actgtttttt ttgccataac catgcaatgg tgctaaagtt agatgccata 5580 aaagatgtat catggcagcc tgcaatgcaa ataaaaacgg ggaaacaatg gaaagttgcc 5640 agaaatttca attact 5656 6 1308 DNA Populus balsamifera subsp. trichocarpa CDS (12)..(1145) 6 ggcagataga t atg gat ccg gag gct ttc acg gcg agt ttg ttc aaa tgg 50 Met Asp Pro Glu Ala Phe Thr Ala Ser Leu Phe Lys Trp 1 5 10 gat acg aga gca atg gtg cca cat cct aac cgt ctg ctt gaa atg gtg 98 Asp Thr Arg Ala Met Val Pro His Pro Asn Arg Leu Leu Glu Met Val 15 20 25 ccc ccg cct cag cag cca ccg gct gcg gcg ttt gct gta agg cca agg 146 Pro Pro Pro Gln Gln Pro Pro Ala Ala Ala Phe Ala Val Arg Pro Arg 30 35 40 45 gag cta tgt ggg cta gag gag ttg ttt caa gct tat ggt att agg tac 194 Glu Leu Cys Gly Leu Glu Glu Leu Phe Gln Ala Tyr Gly Ile Arg Tyr 50 55 60 tac acg gca gca aaa ata gct gaa ctc ggg ttc aca gtg aac acc ctt 242 Tyr Thr Ala Ala Lys Ile Ala Glu Leu Gly Phe Thr Val Asn Thr Leu 65 70 75 ttg gac atg aaa gac gag gag ctt gat gaa atg atg aat agt ttg tct 290 Leu Asp Met Lys Asp Glu Glu Leu Asp Glu Met Met Asn Ser Leu Ser 80 85 90 cag atc ttt agg tgg gat ctt ctt gtt ggt gag agg tat ggt att aaa 338 Gln Ile Phe Arg Trp Asp Leu Leu Val Gly Glu Arg Tyr Gly Ile Lys 95 100 105 gct gct gtt aga gct gaa aga aga agg ctt gat gag gag gat cct agg 386 Ala Ala Val Arg Ala Glu Arg Arg Arg Leu Asp Glu Glu Asp Pro Arg 110 115 120 125 cgt agg caa ttg ctc tct ggt gat aat aat aca aat act ctt gat gct 434 Arg Arg Gln Leu Leu Ser Gly Asp Asn 
Asn Thr Asn Thr Leu Asp Ala 130 135 140 ctc tcc caa gaa ggt ttc tct gag gag cca gta cag caa gac aag gag 482 Leu Ser Gln Glu Gly Phe Ser Glu Glu Pro Val Gln Gln Asp Lys Glu 145 150 155 gca gca ggg agc ggt gga aga ggg aca tgg gaa gca gtg gca gcg ggg 530 Ala Ala Gly Ser Gly Gly Arg Gly Thr Trp Glu Ala Val Ala Ala Gly 160 165 170 gag agg aag aaa cag tca ggg cgg aag aaa ggc caa aga aag gtg gtg 578 Glu Arg Lys Lys Gln Ser Gly Arg Lys Lys Gly Gln Arg Lys Val Val 175 180 185 gac ctt gat gga gat gat gaa cat ggt ggt gct atc tgt gag aga cag 626 Asp Leu Asp Gly Asp Asp Glu His Gly Gly Ala Ile Cys Glu Arg Gln 190 195 200 205 cgg gag cac cca ttc att gta aca gag cct ggt gaa gtg gca cgt ggc 674 Arg Glu His Pro Phe Ile Val Thr Glu Pro Gly Glu Val Ala Arg Gly 210 215 220 aaa aag aac ggt ctt gat tac ctc ttc cat tta tat gaa cag tgt cgt 722 Lys Lys Asn Gly Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cys Arg 225 230 235 gat ttc ttg atc caa gtc caa agc att gcg aag gag agg gga gaa aaa 770 Asp Phe Leu Ile Gln Val Gln Ser Ile Ala Lys Glu Arg Gly Glu Lys 240 245 250 tgc ccc act aag gtg aca aat cag gtg ttt agg tat gcc aag aag gca 818 Cys Pro Thr Lys Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Lys Ala 255 260 265 gga gca agc tac atc aac aag ccc aaa atg aga cac tac gtg cat tgc 866 Gly Ala Ser Tyr Ile Asn Lys Pro Lys Met Arg His Tyr Val His Cys 270 275 280 285 tat gct tta cat tgc ctc gat gag gac gca tcc aat gca ctt agg aga 914 Tyr Ala Leu His Cys Leu Asp Glu Asp Ala Ser Asn Ala Leu Arg Arg 290 295 300 gcg ttc aag gag aga gga gaa aat gtt gga gca tgg aga cag gct tgt 962 Ala Phe Lys Glu Arg Gly Glu Asn Val Gly Ala Trp Arg Gln Ala Cys 305 310 315 tac aag ccc ctt gta gcc atc gca tca cgc caa ggc tgg gac ata gat 1010 Tyr Lys Pro Leu Val Ala Ile Ala Ser Arg Gln Gly Trp Asp Ile Asp 320 325 330 tcc att ttc aat gct cat cct cgg ctt gcc att tgg tat gtg ccg acc 1058 Ser Ile Phe Asn Ala His Pro Arg Leu Ala Ile Trp Tyr Val Pro Thr 335 340 345 aag ctc cgt caa ctt tgt tat gca gag cgc aat agt gcc act tct tca 1106 Lys Leu 
Arg Gln Leu Cys Tyr Ala Glu Arg Asn Ser Ala Thr Ser Ser 350 355 360 365 agc tct gtc tct ggt act gga ggt cac ctg ccg ttt tga gttcttaatt 1155 Ser Ser Val Ser Gly Thr Gly Gly His Leu Pro Phe 370 375 atgccaagat aaatactcct atctctataa aattgtcaaa atgtatgttg tagcgaggtc 1215 aggacaaagt attggttgat ggaggatggt tcattaaatt tcacatcctt gactatttat 1275 atatcatgat atgcttaaag gctctaaaaa aaa 1308 7 1131 DNA Populus balsamifera subsp. trichocarpa CDS (1)..(1131) 7 atg gat ccg gag gct ttc acg gcg agt ttg ttc aaa tgg gat acg aga 48 Met Asp Pro Glu Ala Phe Thr Ala Ser Leu Phe Lys Trp Asp Thr Arg 1 5 10 15 gca atg gtg cca cat cct aac cgt ctg ctt gaa atg gtg ccc ccg cct 96 Ala Met Val Pro His Pro Asn Arg Leu Leu Glu Met Val Pro Pro Pro 20 25 30 cag cag cca ccg gct gcg gcg ttt gct gta agg cca agg gag cta tgt 144 Gln Gln Pro Pro Ala Ala Ala Phe Ala Val Arg Pro Arg Glu Leu Cys 35 40 45 ggg cta gag gag ttg ttt caa gct tat ggt att agg tac tac acg gca 192 Gly Leu Glu Glu Leu Phe Gln Ala Tyr Gly Ile Arg Tyr Tyr Thr Ala 50 55 60 gca aaa ata gct gaa ctc ggg ttc aca gtg aac acc ctt ttg gac atg 240 Ala Lys Ile Ala Glu Leu Gly Phe Thr Val Asn Thr Leu Leu Asp Met 65 70 75 80 aaa gac gag gag ctt gat gaa atg atg aat agt ttg tct cag atc ttt 288 Lys Asp Glu Glu Leu Asp Glu Met Met Asn Ser Leu Ser Gln Ile Phe 85 90 95 agg tgg gat ctt ctt gtt ggt gag agg tat ggt att aaa gct gct gtt 336 Arg Trp Asp Leu Leu Val Gly Glu Arg Tyr Gly Ile Lys Ala Ala Val 100 105 110 aga gct gaa aga aga agg ctt gat gag gag gat cct agg cgt agg caa 384 Arg Ala Glu Arg Arg Arg Leu Asp Glu Glu Asp Pro Arg Arg Arg Gln 115 120 125 ttg ctc tct ggt gat aat aat aca aat act ctt gat gct ctc tcc caa 432 Leu Leu Ser Gly Asp Asn Asn Thr Asn Thr Leu Asp Ala Leu Ser Gln 130 135 140 gaa ggt ttc tct gag gag cca gta cag caa gac aag gag gca gca ggg 480 Glu Gly Phe Ser Glu Glu Pro Val Gln Gln Asp Lys Glu Ala Ala Gly 145 150 155 160 agc ggt gga aga ggg aca tgg gaa gca gtg gca gcg ggg gag agg aag 528 Ser Gly Gly Arg Gly Thr Trp Glu Ala Val Ala Ala Gly Glu 
Arg Lys 165 170 175 aaa cag tca ggg cgg aag aaa ggc caa aga aag gtg gtg gac ctt gat 576 Lys Gln Ser Gly Arg Lys Lys Gly Gln Arg Lys Val Val Asp Leu Asp 180 185 190 gga gat gat gaa cat ggt ggt gct atc tgt gag aga cag cgg gag cac 624 Gly Asp Asp Glu His Gly Gly Ala Ile Cys Glu Arg Gln Arg Glu His 195 200 205 cca ttc att gta aca gag cct ggt gaa gtg gca cgt ggc aaa aag aac 672 Pro Phe Ile Val Thr Glu Pro Gly Glu Val Ala Arg Gly Lys Lys Asn 210 215 220 ggt ctt gat tac ctc ttc cat tta tat gaa cag tgt cgt gat ttc ttg 720 Gly Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cys Arg Asp Phe Leu 225 230 235 240 atc caa gtc caa agc att gcg aag gag agg gga gaa aaa tgc ccc act 768 Ile Gln Val Gln Ser Ile Ala Lys Glu Arg Gly Glu Lys Cys Pro Thr 245 250 255 aag gtg aca aat cag gtg ttt agg tat gcc aag aag gca gga gca agc 816 Lys Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Lys Ala Gly Ala Ser 260 265 270 tac atc aac aag ccc aaa atg aga cac tac gtg cat tgc tat gct tta 864 Tyr Ile Asn Lys Pro Lys Met Arg His Tyr Val His Cys Tyr Ala Leu 275 280 285 cat tgc ctc gat gag gac gca tcc aat gca ctt agg aga gcg ttc aag 912 His Cys Leu Asp Glu Asp Ala Ser Asn Ala Leu Arg Arg Ala Phe Lys 290 295 300 gag aga gga gaa aat gtt gga gca tgg aga cag gct tgt tac aag ccc 960 Glu Arg Gly Glu Asn Val Gly Ala Trp Arg Gln Ala Cys Tyr Lys Pro 305 310 315 320 ctt gta gcc atc gca tca cgc caa ggc tgg gac ata gat tcc att ttc 1008 Leu Val Ala Ile Ala Ser Arg Gln Gly Trp Asp Ile Asp Ser Ile Phe 325 330 335 aat gct cat cct cgg ctt gcc att tgg tat gtg ccg acc aag ctc cgt 1056 Asn Ala His Pro Arg Leu Ala Ile Trp Tyr Val Pro Thr Lys Leu Arg 340 345 350 caa ctt tgt tat gca gag cgc aat agt gcc act tct tca agc tct gtc 1104 Gln Leu Cys Tyr Ala Glu Arg Asn Ser Ala Thr Ser Ser Ser Ser Val 355 360 365 tct ggt act gga ggt cac ctg ccg ttt 1131 Ser Gly Thr Gly Gly His Leu Pro Phe 370 375 8 377 PRT Populus balsamifera subsp. 
trichocarpa 8 Met Asp Pro Glu Ala Phe Thr Ala Ser Leu Phe Lys Trp Asp Thr Arg 1 5 10 15 Ala Met Val Pro His Pro Asn Arg Leu Leu Glu Met Val Pro Pro Pro 20 25 30 Gln Gln Pro Pro Ala Ala Ala Phe Ala Val Arg Pro Arg Glu Leu Cys 35 40 45 Gly Leu Glu Glu Leu Phe Gln Ala Tyr Gly Ile Arg Tyr Tyr Thr Ala 50 55 60 Ala Lys Ile Ala Glu Leu Gly Phe Thr Val Asn Thr Leu Leu Asp Met 65 70 75 80 Lys Asp Glu Glu Leu Asp Glu Met Met Asn Ser Leu Ser Gln Ile Phe 85 90 95 Arg Trp Asp Leu Leu Val Gly Glu Arg Tyr Gly Ile Lys Ala Ala Val 100 105 110 Arg Ala Glu Arg Arg Arg Leu Asp Glu Glu Asp Pro Arg Arg Arg Gln 115 120 125 Leu Leu Ser Gly Asp Asn Asn Thr Asn Thr Leu Asp Ala Leu Ser Gln 130 135 140 Glu Gly Phe Ser Glu Glu Pro Val Gln Gln Asp Lys Glu Ala Ala Gly 145 150 155 160 Ser Gly Gly Arg Gly Thr Trp Glu Ala Val Ala Ala Gly Glu Arg Lys 165 170 175 Lys Gln Ser Gly Arg Lys Lys Gly Gln Arg Lys Val Val Asp Leu Asp 180 185 190 Gly Asp Asp Glu His Gly Gly Ala Ile Cys Glu Arg Gln Arg Glu His 195 200 205 Pro Phe Ile Val Thr Glu Pro Gly Glu Val Ala Arg Gly Lys Lys Asn 210 215 220 Gly Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cys Arg Asp Phe Leu 225 230 235 240 Ile Gln Val Gln Ser Ile Ala Lys Glu Arg Gly Glu Lys Cys Pro Thr 245 250 255 Lys Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Lys Ala Gly Ala Ser 260 265 270 Tyr Ile Asn Lys Pro Lys Met Arg His Tyr Val His Cys Tyr Ala Leu 275 280 285 His Cys Leu Asp Glu Asp Ala Ser Asn Ala Leu Arg Arg Ala Phe Lys 290 295 300 Glu Arg Gly Glu Asn Val Gly Ala Trp Arg Gln Ala Cys Tyr Lys Pro 305 310 315 320 Leu Val Ala Ile Ala Ser Arg Gln Gly Trp Asp Ile Asp Ser Ile Phe 325 330 335 Asn Ala His Pro Arg Leu Ala Ile Trp Tyr Val Pro Thr Lys Leu Arg 340 345 350 Gln Leu Cys Tyr Ala Glu Arg Asn Ser Ala Thr Ser Ser Ser Ser Val 355 360 365 Ser Gly Thr Gly Gly His Leu Pro Phe 370 375 9 11485 DNA Populus balsamifera subsp. 
trichocarpa misc_feature (8507) n represents a, c, t, or g 9 ggatccacct ccacgtcagt ccatccgcat tcgtaagtcc ataaaactac cagattttgc 60 ttattcttgt tattcttcat catttacttc ctttttagct tctattcatt gcctttttga 120 gccctcttcc tataaagagg caattcttga tccgcttcgg caacaagcta tgaatgaaga 180 attttctgct ttgcataaga cagatacttg ggatctggtt cctctacctc ccggtaagag 240 tgttgttggt tgtcattggg tgtataagat caagactaat tctgatgggt ctattgagca 300 atacaaagct aggctggttg caaaaggata ctctcaacat tatggtatgg actatgagga 360 aacatttgcc ccggttgcaa aaatgactac tattcgtact cttattgtcg tagcttcgat 420 tcgtcagtgg catatttctc agcttgatgt taaaaatgcc ttcttgaatg gagatcttca 480 agaagaagtt tatgtggcac tccctcctgg tatttcatat gactctggat atgtttgtaa 540 gcttaagaaa gcattaatta tatggtctca aacaagcacc ccgtgcttgg tttgagaaat 600 tctctattgt gatctcgtct cttggcattg tttctagcag tcatgattct gctcttttta 660 ttaagtgcac tgatgcaggt cgtatcattc tgtctttata tgttgataac atgattatta 720 ttggtgatga cattgatggt atttcagtct tgaagacaaa gttggctaga cgatttgaaa 780 tgaaagattt gggttatctt caatatttcc tgggtattga ggtagcatac tcacctagag 840 gttaccttct ttctcagtcg aaatatgttg cagatattct tgagcagact agacttactg 900 ataacaaaac tgtagatact cctattgagg tcaacgtgag gtactcttct tctgatggtt 960 tacctttgat agatcttact ttataccaca ctattgttag gagtttggta tatctcacca 1020 ttactcgtcc agatattgca tatgctgttc atgttgttag tcagtttgtt gcttctctta 1080 ctactgttca ctgggcagct gttattcgta ttttgcgata tcttcggggt acagtttttc 1140 agagtctttt actttcatcc acctctttct tggagttgcg tgcatactct gatgctgatc 1200 atggtagtga tcccacagat cgcaagtctg ttaccgggtt ctgtatcttt ttaggtgatt 1260 ctcttatttc ttggaagagc aagaaacaat ctattgtttc tcaatcatcc atcgaagcag 1320 aatatcgtgc catgacatct actaccaaag agattgtttg gttatgttgg ttacttgctg 1380 atatgagagt ttcattttct catcctactc ctatgtattg tgacaaccag agttctattc 1440 agattgctca caactcggtt tttcatgagc gaactaagca cattgagatc gattgtcatc 1500 ttactcatca tcatctcaag catggcacca ttgctttacc ttttgttcct tcttccttac 1560 agattgcaga tttctttatc aaggcgcatt ccatctctcg tttttgtttt caggttggca 1620 aactctcgat gcttgtagct gccgcattgt 
gagtttgagg ggagatgtta aataatattt 1680 atgtagtctt atttattaag ggtagaatag tactttcagt ttaacctata tatactttat 1740 ttgtatttag gttaagacta agcattcata ataaatgtat cattaagaat tctagcctcc 1800 ttttcgtgtt tcattttaat tatttttaac aatcttgtat taatatatgt gaattgattg 1860 aattaaatct tgtaatacaa tttaattgat tctaagttaa acaatctgct ggggaacatt 1920 catacaacta tctttctttt cgtttcaagt aggcaggaaa taaaacgttt ttagtttagg 1980 tgactaaaca atggaattta atgaaataag ggtagagatg aggtctgagg ttatcttgtt 2040 aagcaccttc ccatttgaac catgattttg tcgttaagca ctgagagtgt aacttagccc 2100 taaaacgtct cactcacccc attataattc attttcagaa agtcccttgc ttttctctct 2160 aatgacctaa atcatttcct tgaaagccaa aaataaaaaa taaaaacgaa tatagtggag 2220 agttattgag gtctgaatct gacgacagat tcccaccttt agcctcttct ttttaattcc 2280 tcttcaatgc tcaccactca tcaataccaa gataagaaaa agaaaaaaaa atggaaaaat 2340 tattgaagaa gagaaattac aaagacagta gttagacttg gtagaagtat tgttatatat 2400 aaagattgga tgagaggttg tttttcactt tataaatacc cacctcttag cccaaacttg 2460 cttccatttt cttcatctct ctactagtta gatttgtagg agaaatccca aaggaaaaga 2520 tcctcacttt ctctacacat taactgctat ctacagcccc tagctacttt gttttatttc 2580 ctcccaaggt tagttactaa aacatggagt cataaatctc gttgtattct tcagtgcttc 2640 atcacttgtt ttgggctaat taatcaatct tttcacgttt caaaacccac ctcttctttt 2700 tctgttttga tcactcagaa accccaaaaa atacaacttt caaacatttc tgtctccctt 2760 tcccatttca atctccagat tgaagcacca gtgatttatt tttgttttgt tgattgatta 2820 ttttgaccat aaccaataaa ccataacaat cgcaattcag aagctccaga cgttcatcga 2880 cccctttttc ttatgtttat tttatattac ttccatcctg gactactcat ttggacaaaa 2940 aaagtattgc taaatatgct atgagttgtg catatattat tcttgaatta gtagtatttt 3000 tttcatttta ttacattttt tgtgttgtca ctcagtttgt gttttggatc agctagctag 3060 gctgcagcta tggaatatca aaatgaatcc cttgagagct cccccctgag gaagctggga 3120 aggggaaagg tggagatcaa gcggatcgag aacaccacca atcgccaagt cactttctgc 3180 aaaaggcgca gtggtttgct caagaaagcc tacgaattat ctgttctttg cgatgctgag 3240 gttgcactca tcgtcttctc tagccgcggt cgcctttatg agtactctaa cgataggtaa 3300 ataaatctaa ttttagatat ttgcttctct ggatcttaaa 
ttctccatgt tacaagccct 3360 ctatcttcat gtggtcactt tttttttttt tttatcttcc tttctgcccc aaagagattt 3420 ttttatcctc tctattttgc ttatgttagt gttaattttt agctttaatt ggtttctttc 3480 attttcattt tctttctttc atgaatgatc attaaatggt tttcaatttc taaggtggga 3540 aatttattat tattattatt attttgtgtt taatctctgg gtaaaggatt taaagcaaaa 3600 gagacacaat cattccttat gctgcagttt agattgagtt tcttatctaa ctgagattca 3660 cttgtctttc tttctttctt tctcttctct taccctttag acgatgctga tgcacacgtt 3720 attttgagtt cttggtttgg taaaaacata gatctggtat aataaacaga catagaagca 3780 ctatatgagt gtagtatggt agcagaaata agtataggtc tgtgagatca gcctctttat 3840 ctcctccctt gttgttaatt ttgttgtttc cgtttttctt tctcttccat tattcctctt 3900 gcactctcta tctctcgctt tttttttgca catacttgtt tgtttgtgtc atctacgagg 3960 ctaaagagat tgcctatagc caaagctgtc atcttctcat tagtccaaac cctccatctc 4020 ttttcacttc ctagttaaat agcacgtcaa ttagacatca agaaagcaaa agtaccatgt 4080 caaataaccg tgaaaaagaa gaagaacaaa gaaaggtttt tttaatttgt catgtcactc 4140 aaacatatat tattagggtt tcaaatccca aatccccaga tgggtttttc atcttatttt 4200 atttttccaa accaatccag ggtttttccc ctaatcacac gaaatttccc aaaatctcag 4260 tttgaaccca cgaggggata gtgaaaacct ttctgttagt caatgcataa ccccagttag 4320 ggttcatagt tagggttcat attcaagtaa ccacatgaaa tcatcgaaat cgtacattaa 4380 cattcaagga aaactgttaa atcaagcaag tggacccttc cacaaccaat caaaactcag 4440 ttagatttca cctagatttt tacccctttt ttaacctggg taagtatggt acagtaatcg 4500 gttagggttt agtagccagt caaatagatc agattgttgt tcgggtttat gaacagaatc 4560 tttggtaacg tcacacacga tttttcagtt cttgcctact gacaaaaggc tttatgtcat 4620 gattccttaa actgaaccca agatttttaa cttccgatcc ccctggaaaa aatatgaaat 4680 tccaaaaatt gtccatttct tctccttaga tctctctcta tctctctccc ggttaaattg 4740 tttccatggt gaaagcagag agatggatca atgagaatgg gttaaccaag gccataatga 4800 tggcactgtt taagatcttg tatagatata tttatataag tttttttttt ttttaattta 4860 aagagagatt tagccccatt tgtattttta cggtgagaaa acacttttat aaaaaattga 4920 tattttttta aaaattattt tttatatttt ttagattatt tttatgtgtt aatattaaaa 4980 ataaattttt taaaatataa aaaatattat attaatatat tttaaataaa 
aaattaaccg 5040 ttgatgacaa tattgagaga aagagagtcg tgaagagaga atgaacgaca actgttaacc 5100 agtggaagag ttctgtcaat tttggtttct tctatgtaat agaaagccta caactctagc 5160 tggtattgta cggctctgct tctctcagag tttcagtctg agactaataa aatgtccgat 5220 tagtacaata ttttattaca atgaaataga atatcgaggt gggtaataga gtgagtttaa 5280 ggagattatc cactatgtaa tgggttattg acacgtggag aatatttgac cgctgatcta 5340 ccttggccaa tcatattgta ggattcagtg acagcttggc agagacagcc aatcaatgtc 5400 tcgacgaagt taaggtataa ggaaatctag aaaagcggtt cttgtctgaa ttgacaagat 5460 gtgttcacat tttactgaga ttattatggc aaaattttag gatttccttc gcattgtgtc 5520 gaggaaagac tggataatca gactgactcg gagagctgtg gttttgtcat tcatcttctt 5580 tttagggttt tctacgagtt aacttaatgg agttattcgt tgatttgact gtttaattgc 5640 cttaccgtca agctttgtta taataaggat tttttaaatt gtttttttta tttataaata 5700 tattaaaata atatttttta atttttaaga tggcatatca aaaatatttt aaaaaataaa 5760 aaaataattt gaaataaaac aaaaattaat ttttttaaaa caatattttt aacgcaataa 5820 caaattctta atcttttact catatatctt aaatttacga gagttttttc caaaaagata 5880 aagagatata tgtaagcgat aaagtattag taacctcaca taaaataatg tacaataata 5940 gataaaaact aaattttata taaaaattga atttcaatcc actttctttt ttcgtggatc 6000 ataaggagtt ggacttgctt ttttcacggt aatttgacca aagaaagagt taatacaaat 6060 aatattaatt aagatattat ctcttgttgt ttgttcttgt tttgaaataa tttagttttt 6120 tttttaagaa aaaaaagttt ttccaataca taagcaatac aaaagtgttt gaacatggta 6180 attcttcttc ttcttagttg accaaattac atttggtaga ctaaagttgt tcatatatat 6240 gctaccattg atagagtcat tggccaatta tatgttttta cgtcattata tttgaattct 6300 tttgttaata gtaattatta atcactgaag ttattgcatt cttgtcagct gataaactcc 6360 aagttgtaat tttatgtttg atcttgtaat taagagcaag ccaggaggac atctctagtg 6420 ttcgaggaaa ttgacaaaat ttgcttcctc aaatatattt ttgtttttca ttggacaaaa 6480 atacatgtta tatatatata tatatatata tatatatata tatatatata taatgcctat 6540 attttgtgag tagttccata agtttaggat atgtttgagg tagtttaaca taagcatttg 6600 attttttttt tcaatcctta tatcaaaatt atcataaaac aattaaaaaa tcattaattt 6660 attttatttt tttaattaaa aaaaacactt ataaacacag tattacccaa atacagattt 
6720 atgaagccgc catgtggtaa aaaaatacat gttagagata tcagaagttt acaagcatgt 6780 ttatatgcgt taatgtggca tatgaaatgt catatcaatt gcgttacaaa gcttttcttg 6840 tgctaagtgt ggcgttagta ataagcaagt gtttgtaaga attgtcaaca cgtgtgttta 6900 cttacttgaa agaacattaa ttgctaattt tattaaataa ttaatccttc ctattactat 6960 cttgggatag gttgaagagc ataaggaaaa gggttaccat gataaataca aaaaataaaa 7020 aaggaggaag gagtagtttt caattttatt ttaattgtca atactatgtg cttggtgaaa 7080 agttatctgt cctcattttt atttattgtt ttttacaaaa agcatagaat aatgtgtgtt 7140 tcatgtgttt ggttagaggt tatagatgaa aagctttaat aataaatagt agctaaatat 7200 acttcattgt ttgagtggta gaggagattt ttaaaattta tgaagactac aattctcttt 7260 catttcaaat aacatcccta ttttagtggt gagattaatg tatttgtttc tctttttcta 7320 ttttctttta tcaatattat atataaaact aaaatgcatc agtgttttac tatggattga 7380 tcataatgca attcactata aaataattga tgcttccctt aaaaaaccaa ataattaaac 7440 aaacactcag ggttaatttt gtattttcat atctttattg catagtgtaa ttatttctat 7500 gtccttgaaa aaagaaaaaa aacactaggg tttttttaaa aaagtttcat atttttttgt 7560 atagtgtaat tatcccactt ttggggccaa ctttttttta cctaaggtaa aggggtattt 7620 ttggtttttt tatgtttgtt tttttgcaat tattatatgg gatcaagagt gttatgatct 7680 ttttatataa aaaaaaaatg gttgacacgt gatctacaat tccccctccc ttttcattcc 7740 taaccttgaa agtcttagtg aaacatatag ttataataaa gaaatattat ctctagtttt 7800 gcaaattaat ttcataacat caattaaata ttctgataag gtaaagttat ttaggatgga 7860 gaaatttaca taatgaagcc tccttctgcc tgagtagtgc atttctatgg tatttatgag 7920 catcaattct acaatccatt gaagcaaaag aactaacctt cttgaaaccc tcttgcagat 7980 aattgtgagt gaatgtaagt ccactacgaa atattcacac gattacgcac ttagttatca 8040 ttaaactttg tttttggtgc tttgcatttt cttaattaga ttcttccaca gctttccaat 8100 gcacattttg atgacttttt ttattttatt tttcttgatg gaaatgttga catgattgca 8160 gtgtcaaatc aacaattgag aggtacaaaa aggcatctgc agattcttca aacactgggt 8220 ctgtttctga agccaatgct caggtaccat atatcagctc taactaacaa tttgtactca 8280 taatatctat tagatggagt tcaagcataa tattcctccc aataatttat tgccaatata 8340 gtgctatgct accacttcat tcactctttc ttgataaccc cagcttgtat aaaatctatt 8400 
agatacctct aagtttttgc cttacctttc tcactagtgt ctgacatgac actagtgttc 8460 acatggatta gcatctcgga gttgaaggtt gtctggcttc ttcgaanatc cagggttttc 8520 aagaaggttt gtacattggg aggcccgtgg ttataaacct actgtgtaaa tggtttgata 8580 aataatgatt catcagattt gagtaatagt cttttaattt ctttgtaaat gttgtctatg 8640 ttttttccag tcctccctac acacactctg ataattataa ccaattttgt ttcgcttcct 8700 cctttcgcta tgctcctact gaatttattt ccagtttgat tcagtattat atgcatgttt 8760 acaagaaaat agaagggggg aatctacatc actgagattt tctacctgta ttttatcaac 8820 tgatctaata tgaacttgag gctcttaatt ttgttatata taatgtttta ttgccttttg 8880 ttcttgcatc tcagtactac cagcaagaag ctgccaagct gcgttcccaa attggtaatt 8940 tgcagaattc aaacaggtca gagcctgttt gatattgatc tatttgtcag atgatatcgt 9000 tttctcttcc aaactccgct taagtataaa ttatatttca ggcatatgct gggtgaagcg 9060 cttagttcat tgagtgtgaa ggaacttaag agtttggaaa tacgacttga gaaaggaata 9120 agcagaattc gttccaaaaa ggttttgata ctagtaccga attgatacta tcacattttt 9180 ttgttttact tggatatcac atttccatgt atggccatta acaagttttg tgttcatact 9240 ttcctgctat gtttctaaaa aattcctccc gcaaaccttg ccagaatgag ctgttgtttg 9300 cagaaatcga gtatatgcag aagagggtaa tgcttcttat gttatcacat ttcccattta 9360 tttaatattt attgttttct ggtggagtat attctatatg attgttatat attctgaggt 9420 aaaagtcatc tagtgtttat taacataatg attctatggt caacttattc cttcctgttt 9480 tcactccgag attttccttt gattccttga atgaaaatgc acattacagg aggttgactt 9540 gcacaacaat aaccagcttc tccgagcaaa ggtctttctt ctatctatct atttatccat 9600 ctcgagtgag ggcaaggatg cgtgcgtgtg catgaatgaa gatctctatg tcttatatcg 9660 ttagtgagct gtttataatt tagaaatatg aggcttatct tgatagtgca gatttcagag 9720 aatgaaagaa agcgacagag catgaatttg atgccaggag gagcagactt tgagatcgtg 9780 cagtctcaac catatgactc tcggaactat tctcaagtga atggattaca gcctgcaagt 9840 cattactcac atcaagatca gatggccctt cagttagtgt aagtatctcc tttgtaacga 9900 ataataggtt ttcattaacc ggacaaccag atttagtgtt gtgcattcat aaaatacaat 9960 taattacttt aatttggaga tgttccaaaa gttgcaactg catggttcat gggctctaat 10020 ttcttggaag tatataaccg atgctatgtc ttttcattct cataattact gatcagtccc 10080 
ttatagatga ttatttgcag attcttatga ccattttccc attgagatta taagattttg 10140 acatcgaata gttggactag gagtaaagag ctgttgctgt tatttagcac cccaaaggaa 10200 atattatata cctctgaacc aattgaatgg ccgacctagg tttactgaaa tgtttagctg 10260 taagaaggtt aagtgttatc agattcccca agtgagaagt acatgtttct tagcatactt 10320 tatgtttcac gcaccttgat ttttcaaact ttgtttatcg atttctgaac taaagtgact 10380 acattataga acttgaacct aaaattactc tcctcactat aggtgaaatc agattacttg 10440 aaaatactac taaaaaaaat tatggcgttt gctggtattt ctaacatctt ttctgctaat 10500 cttgtattaa ttttctccta gatgaacttg ttattatgta aaaaggtttc attactcatg 10560 caatggtgca ctaatgcttg aggagttcca agtaactttg ctgtctcatg taaagaagag 10620 tgctgaagtt cactatggtt taacttctac tgcactgctt gatattgcca tgaactctga 10680 catcatttgg cttgatcttg ttctaaaatc taaatgaaat aattctctct tactatatat 10740 cttcttaacc ctttgcatat gattaagtgg tctttgatag gatatcatta aaacctcgca 10800 taaaagctac cattttataa atttcaaact ccacgacgca ttttctggtg attccattgc 10860 tgattattgt ttaaagacat cattattcca attagtacat gtataataat ttcctctgtt 10920 gttggtgcag ttaataatct ccaagtgcag cagtttctcg catttccata ttccatggag 10980 agtacctggg tttccattga gcgcaaaagc tacatgtatg ctaaaaaacc tgaagtagcg 11040 taaatcatat ttgtctgggt gggagggcct agtactcttc ctctatgtat taactatcct 11100 gtcccagtta agacataaga aatgtcagag aaggatttct tttctgtatg tttcatgaag 11160 gcattaagat gctgttacag ttgtgactaa cttattatat atgtcttact gcttcatctt 11220 gtgatatttt cttgcatgtt aatctgatta aagtgtagct tagaccattc accatgttaa 11280 tggtgacttg ttggtgacta ctagtagctg tagctctccg tagtactgct atgccttcaa 11340 aaaatgatgg gtcggaaatt actagctagc tagtattgct gtttcattca atctctgctt 11400 taacccaaaa atcaggacta gtggattagc atacctctca ccaggacaat gcactagagc 11460 acattttcat cttcttctca tattt 11485 10 1219 DNA Populus balsamifera subsp. 
trichocarpa CDS (196)..(921) 10 tgagaggttg tttttcactt tataaatacc cacctcttag cccaaacttg cttccatttt 60 cttcatctct ctactagtta gatttgtagg agaaatccca aaggaaaaga tcctcacttt 120 ctctacacat taactgctat ctacagcccc tagctacttt gttttatttc ctcccaagct 180 agctaggctg cagct atg gaa tat caa aat gaa tcc ctt gag agc tcc ccc 231 Met Glu Tyr Gln Asn Glu Ser Leu Glu Ser Ser Pro 1 5 10 ctg agg aag ctg gga agg gga aag gtg gag atc aag cgg atc gag aac 279 Leu Arg Lys Leu Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn 15 20 25 acc acc aat cgc caa gtc act ttc tgc aaa agg cgc agt ggt ttg ctc 327 Thr Thr Asn Arg Gln Val Thr Phe Cys Lys Arg Arg Ser Gly Leu Leu 30 35 40 aag aaa gcc tac gaa tta tct gtt ctt tgc gat gct gag gtt gca ctc 375 Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu 45 50 55 60 atc gtc ttc tct agc cgc ggt cgc ctt tat gag tac tct aac gat agt 423 Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asp Ser 65 70 75 gtc aaa tca aca att gag agg tac aaa aag gca tct gca gat tct tca 471 Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys Ala Ser Ala Asp Ser Ser 80 85 90 aac act ggg tct gtt tct gaa gcc aat gct cag tac tac cag caa gaa 519 Asn Thr Gly Ser Val Ser Glu Ala Asn Ala Gln Tyr Tyr Gln Gln Glu 95 100 105 gct gcc aag ctg cgt tcc caa att ggt aat ttg cag aat tca aac agg 567 Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg 110 115 120 cat atg ctg ggt gaa gcg ctt agt tca ttg agt gtg aag gaa ctt aag 615 His Met Leu Gly Glu Ala Leu Ser Ser Leu Ser Val Lys Glu Leu Lys 125 130 135 140 agt ttg gaa ata cga ctt gag aaa gga ata agc aga att cgt tcc aaa 663 Ser Leu Glu Ile Arg Leu Glu Lys Gly Ile Ser Arg Ile Arg Ser Lys 145 150 155 aag aat gag ctg ttg ttt gca gaa atc gag tat atg cag aag agg gag 711 Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu 160 165 170 gtt gac ttg cac aac aat aac cag ctt ctc cga gca aag att tca gag 759 Val Asp Leu His Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ser Glu 175 180 185 aat gaa aga aag cga cag agc atg aat ttg atg cca gga gga gca 
gac 807 Asn Glu Arg Lys Arg Gln Ser Met Asn Leu Met Pro Gly Gly Ala Asp 190 195 200 ttt gag atc gtg cag tct caa cca tat gac tct cgg aac tat tct caa 855 Phe Glu Ile Val Gln Ser Gln Pro Tyr Asp Ser Arg Asn Tyr Ser Gln 205 210 215 220 gtg aat gga tta cag cct gca agt cat tac tca cat caa gat cag atg 903 Val Asn Gly Leu Gln Pro Ala Ser His Tyr Ser His Gln Asp Gln Met 225 230 235 gcc ctt cag tta gtt taa taatctccaa gtgcagcagt ttctcgcatt 951 Ala Leu Gln Leu Val 240 tccatattcc atggagagta cctgggtttc cattgagcgc aaaagctaca tgtatgctaa 1011 aaaacctgaa gtagcgtaaa tcatatttgt ctgggtggga gggcctagta ctcttcctct 1071 atgtattaac tatcctgtcc cagttaagac ataagaaatg tcagagaagg atttcttttc 1131 tgtatgtttc atgaaggcat taagatgctg ttacagttgt gactaactta ttatatatgt 1191 cttactgctt caaaaaaaaa aaaaaaaa 1219 11 723 DNA Populus balsamifera subsp. trichocarpa CDS (1)..(723) 11 atg gaa tat caa aat gaa tcc ctt gag agc tcc ccc ctg agg aag ctg 48 Met Glu Tyr Gln Asn Glu Ser Leu Glu Ser Ser Pro Leu Arg Lys Leu 1 5 10 15 gga agg gga aag gtg gag atc aag cgg atc gag aac acc acc aat cgc 96 Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg 20 25 30 caa gtc act ttc tgc aaa agg cgc agt ggt ttg ctc aag aaa gcc tac 144 Gln Val Thr Phe Cys Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala Tyr 35 40 45 gaa tta tct gtt ctt tgc gat gct gag gtt gca ctc atc gtc ttc tct 192 Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser 50 55 60 agc cgc ggt cgc ctt tat gag tac tct aac gat agt gtc aaa tca aca 240 Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asp Ser Val Lys Ser Thr 65 70 75 80 att gag agg tac aaa aag gca tct gca gat tct tca aac act ggg tct 288 Ile Glu Arg Tyr Lys Lys Ala Ser Ala Asp Ser Ser Asn Thr Gly Ser 85 90 95 gtt tct gaa gcc aat gct cag tac tac cag caa gaa gct gcc aag ctg 336 Val Ser Glu Ala Asn Ala Gln Tyr Tyr Gln Gln Glu Ala Ala Lys Leu 100 105 110 cgt tcc caa att ggt aat ttg cag aat tca aac agg cat atg ctg ggt 384 Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg His Met Leu Gly 115 120 125 gaa gcg ctt 
agt tca ttg agt gtg aag gaa ctt aag agt ttg gaa ata 432 Glu Ala Leu Ser Ser Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile 130 135 140 cga ctt gag aaa gga ata agc aga att cgt tcc aaa aag aat gag ctg 480 Arg Leu Glu Lys Gly Ile Ser Arg Ile Arg Ser Lys Lys Asn Glu Leu 145 150 155 160 ttg ttt gca gaa atc gag tat atg cag aag agg gag gtt gac ttg cac 528 Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Val Asp Leu His 165 170 175 aac aat aac cag ctt ctc cga gca aag att tca gag aat gaa aga aag 576 Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ser Glu Asn Glu Arg Lys 180 185 190 cga cag agc atg aat ttg atg cca gga gga gca gac ttt gag atc gtg 624 Arg Gln Ser Met Asn Leu Met Pro Gly Gly Ala Asp Phe Glu Ile Val 195 200 205 cag tct caa cca tat gac tct cgg aac tat tct caa gtg aat gga tta 672 Gln Ser Gln Pro Tyr Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu 210 215 220 cag cct gca agt cat tac tca cat caa gat cag atg gcc ctt cag tta 720 Gln Pro Ala Ser His Tyr Ser His Gln Asp Gln Met Ala Leu Gln Leu 225 230 235 240 gtt 723 Val 12 241 PRT Populus balsamifera subsp. 
trichocarpa 12 Met Glu Tyr Gln Asn Glu Ser Leu Glu Ser Ser Pro Leu Arg Lys Leu 1 5 10 15 Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg 20 25 30 Gln Val Thr Phe Cys Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala Tyr 35 40 45 Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser 50 55 60 Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asp Ser Val Lys Ser Thr 65 70 75 80 Ile Glu Arg Tyr Lys Lys Ala Ser Ala Asp Ser Ser Asn Thr Gly Ser 85 90 95 Val Ser Glu Ala Asn Ala Gln Tyr Tyr Gln Gln Glu Ala Ala Lys Leu 100 105 110 Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg His Met Leu Gly 115 120 125 Glu Ala Leu Ser Ser Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile 130 135 140 Arg Leu Glu Lys Gly Ile Ser Arg Ile Arg Ser Lys Lys Asn Glu Leu 145 150 155 160 Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Val Asp Leu His 165 170 175 Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ser Glu Asn Glu Arg Lys 180 185 190 Arg Gln Ser Met Asn Leu Met Pro Gly Gly Ala Asp Phe Glu Ile Val 195 200 205 Gln Ser Gln Pro Tyr Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu 210 215 220 Gln Pro Ala Ser His Tyr Ser His Gln Asp Gln Met Ala Leu Gln Leu 225 230 235 240 Val 13 10007 DNA Populus balsamifera subsp. 
trichocarpa 13 tccattattt caacaataga ttcatttaca ctagcatgga tacttcaatg aataagaagt 60 gtgttattgt ttggagtaat agacacctat aattctcaaa ccttttactt tatttttatt 120 tcctttgtta tattacattt ttcatttctt tattgggttt tcattgacag gatggctaga 180 ttaatatagt ttcttgactt taataaataa aaaaaagatc aagactctct tcacaaacct 240 ttacaaaatt gggcgctata tctaaactaa aaaacttaag attatatact atctaaggag 300 tagcacacta taaataacat tataaaggta gtttgttgag cggaactaga ctttgcaaaa 360 taactttcca atatagcttt tcttgttgat gttgaccttt taatttagga tcaaacactt 420 gtaaattaca attaaaaggc ttatttttgt ttgccatttt taccaagcaa tgttaggatt 480 gctagagatt agtttttcca tggaataaga agttatcttt aaagggctta aaagacctag 540 tagcttgaca aggctatgac ttgtgttgtt ttggatcgta tggttattgt tatagaggtg 600 ctagtggtta aagacatcca tcatggaggt ggtgatgact taaaagagtt agatgtaaat 660 tggagactta tgttattctc acataaaaaa tgttagcctc cgacattgtt tttggatgtg 720 taaaatcaat gtaccatttt attcttcatt gtttgttttc cttattatga cttttacaaa 780 tttatccttt aggtgatgaa attccttcaa tcttgttcta tttttttttt taattcttgg 840 tacgtagttc tgtacttaat caagcaacat aaaatagtga tgccatcttc atcactctat 900 aaacgtggaa acccaaatct ctggctttta ttcatgatta aagtcatttc tagatttttt 960 tagacgttca agtgagattt agggttcaat aagagaggat caatggtgaa aatagaagaa 1020 caaagttgtt gtggttaagt tgactcggtg gttgttgagt tgggatatga aggaatagat 1080 ggtagactaa tctagtgttt ttgtccactt gagttcttaa ttattattcc atctccatga 1140 ctatttccat cttcttcttc agtgatattg tttatactct gtgatttggg tttattggaa 1200 cttattattg aggcagctca tccatagaaa tttggtactt gcttcaacaa accactaaaa 1260 tgttgtgtgg ttaatatttg agaatgcgcg aaaaaagcat cgtactaaat ttgggttccc 1320 gactggatga agagagatgt gattacttaa tttatttgga ttttcggggt ttattagatt 1380 tttggaaagg taatacgata tcattggttt tgagaggaaa taacattggg attttgatga 1440 tttttgaata ataaaattaa gttttttctt gattcatttg ttaatagaaa gagaagaggg 1500 atagctctct tattctagca gaagtacgta tatgagctat gggatttaat tcttaatttt 1560 gtatgagtta ttgatcaaag aaaaagcaat gatgtgagaa gtctatatat ataatttctc 1620 ctacgtactc cgttgaacct tttttcctaa taaaaattga tagaaaatct acaacatata 1680 cagagaaatg 
tgaagttctt caattgagaa taaatcgttt caaaaggacg taggaatctc 1740 cttgtagtga gtgaaactcc aagaaaatta aacaacctgc tggggaacat ccatacaact 1800 atcctccgat cccttctttt cttttcaagt aggcaggcaa taaaacgtat ttagcatagc 1860 caagttcaaa aaaaaaacaa gaagaagaag aagcaatgaa ataagggaaa agatgaggtt 1920 ctcttgttaa gcacctttca tttgtaccat aattttgtcc ttggaatgat tagagagccc 1980 aaaaacgtgt tattcacccc agaaaaatcc attttcaaaa agtccctttc tcttgatgac 2040 ctaaatcatt cacatggaag ccaaggaaga aaatgaaaaa aacgaatata gtggatggtt 2100 attgaggtct cagtcttcct atagcgtatt ctctaattaa ttccaagata aaaaaaaaaa 2160 aaaattacaa ggatggtgta gataaactta gtagaaagta ttgttatata tatatatata 2220 tatatgggaa tggatgaaag gtcgtttatc acttttataa atgcccacct cttagcccca 2280 acttgcttcc attttctgca tctctcctac tcagattcgt aggaacaaag aagagagaaa 2340 ccccagagca aaagatcctt actttctctc cttaataact actatctcta caacccctac 2400 tttggtttat ttcctcccaa ggttagttac caaaacactg agacatatat ctcgttgtat 2460 tcttgagtgc ttcacttgtt tggggcttat caatcttctg atcttcttat ctcttcttca 2520 tcatagtgac tgaggaaccc catcagatga aacttttaat tttctaaaaa agatttactt 2580 acaaacgttt ctgtcactct ctgccgtttc aatctccaga ttgaagcatt actagttcat 2640 ccctttgttt tgtttctcaa ttattttcat atccatgaaa ccataacaag ggctaattca 2700 agagctagct gcaggcgttc atggaacccc tttcttctgt ttattttgtc ttccatcatg 2760 agctattcag tgctcaagag tattcctgct aaatatgcta tgaattatcc ttatatataa 2820 atcattcttg aattaattac tagctagtag ttcagtaatt ttattactct cttttctgct 2880 gtcttcaccc agtttgtgtt ttggatcagc tagctaggca gcagctatgg cataccaaaa 2940 tgaaccccaa gagagctctc ccctgaggaa gctggggagg ggaaaggtgg agatcaagcg 3000 gatcgagaac accaccaatc gccaagtcac tttctgcaaa aggcggaatg gtttgctcaa 3060 gaaagcctat gaattatctg ttctttgcga tgctgaggtt gcactcatcg tcttctccag 3120 ccgtggacgc ctttatgagt actctaacaa taggtatata cttagttcct cggctcatga 3180 attctccatg ttgcaaaccc tcttcaagtg ctcaaagttg gtttttcttg ctttctcatc 3240 caaagggatt tgttttttct ttttgcttat gtcagtgtta atttttattg ctttggtttt 3300 gagctgtttc tttaattggt tttcttccat catcattttc tttcttcaat tggttttcaa 3360 cgtttgttgt ggggaaaaaa 
aataggagcc tggtgtcaag gtttttagct tctgagctag 3420 atcttcgggt gtctttaaag taaaagaaca caatcattct ttatgctgca gtttggattg 3480 aatttcttct caaaatacaa ttcacttgtc tttctttctt ctatttcttt tcttttcctt 3540 gtataagcat aattaatgtt ttgtttttcc ttttctttat ttcaccctta gatgattgtg 3600 atgcatacat gattttgagt tcttggtaca tagatctggt gtattagata gacatagaag 3660 cacaattata agtgtaataa ggtagtagaa acaagtagag ggctgggaaa atgtatgcag 3720 gcatgtgata tcagcctctt tatctcctcc cttgatgtta agtttgctgt ttcctttttc 3780 tttcttttcc atcattcctc ttgaactctg cctctctcct ttactctttt cttgcacata 3840 catgcatgtt tgagtcatct ctagggctaa agagattacc tatagctaaa gctgtcatct 3900 tctcattagt ccaaaccctc ccatctcttc tcacttccaa aatagcacgt cagtcggaca 3960 taagaagaaa agagtacaaa gtcaaattaa atgtgaaaaa aaaagaaagg gtttttttat 4020 atgtcatgtc accaaacaca caaacatata ttactagggt ttcaaaatcc aaatccccaa 4080 atgggtttct tcatcttatt ttatttttcc aaaacaatca ctaggatctc tcaatttagg 4140 attctttttc ctctaattca cacgaatttc acaaaatctc tgttcgaacc cacgtgggga 4200 aagtgaaaag ctttttgttt ttcaagcata gccctagtta gggttcatat ttaagtaacc 4260 acttgaagtc atcaaaattg aaccgaaact ttagtgcaaa ctattcaatc aaccatgtgg 4320 attcttccat aaccagtcaa aaattaagtt agatttcacc tagattttta ccctttttaa 4380 cctcggtaag aagggtacag taactggtta gggtttaata gccagttcaa tatatcagat 4440 tgttgttttg gtttatgaaa agaatctttg gtcacgtcac acacgatttt tcagttcttg 4500 actactgaca aaagggttca agtcatgatt catgaaaatg aaccataaat tttgaactcc 4560 caatctgcaa aaaaaaagaa gaagaagcaa taccacacag aatattgtcc atttcttctc 4620 cttagatccc tccctctctc tgttattttc tttcccatag tgaaagagag atggaacaac 4680 gagaaagggt tagctaaggt catgatgatg ccattgttgg tcattgttga gtgtggtttg 4740 cgtttgttca agatcttgaa tatatgtatg tatgtatgtg tatgtatgta cgcaagttct 4800 tttaggaaga gagagttaat acagagagag aaagagaaga gacgatgtac gacaagtgct 4860 agccaatggg agacttctgt caattttggc ttttttaatg taaatagaag cgtaaaactc 4920 tagcagctgc tgctgctgct cctctctgag agtttcagtc acattcaaga aaacaaaaaa 4980 aaataatttt ttatcttatt acaaatgaaa tagaatatcg aggtgggtaa tagatggtgg 5040 ctcaagagat tatccaccat agaagaaaaa 
aagaaaaata gatattgccc accatgtaat 5100 gaggaattga caagtgggat agatgtgacc gttgatttag cttggccaat catgttatag 5160 gactcagtga cagcttggca gagacagcca atcactggct cgacgaagtt aaggtatcag 5220 aaaatctaga tatctgggtc ttgtcttaat tgacaaaatg tgttcacatc ttactgttat 5280 tattatggca aaattttagg atgcacaaag aactgggtgg aggttttccc atccatcttc 5340 ttctttaggg ttttaaatga gttaacttaa tggagttatt agttgatttg actctttgat 5400 ttgaatgttc ttaccttaaa atcatggctt aacatcatcc ggtgttggta caggggcatg 5460 attttgctct ctccctcttc agtcaatttc atcatttatt ttgatatata attttctttt 5520 tccctaatgt taagcctcta ctgtcacatc tttaaattac tagagggata tgtaactgat 5580 aactagtaac ttcacataaa atactgtaca atatagattt aaaaaaggaa ttttatataa 5640 aatttgaact ctcaatttta ttttattttt gttatttgga tgagtagttt gtctaaagca 5700 atagctaata tggagggtat taagaaactg cctctagttg ttgacacaaa agccttgaag 5760 cacgtatttt actcgctaat tcacacttct tggctgtgct cttaccatct tggaaaataa 5820 atggatttca aaaaagtaca tggtttctta atctatttga aatgatttaa taatgaaaaa 5880 taaaatcaaa tgtaaagtct taaatgtaat aaaaaataat tttcgatgtt ttttgagtgt 5940 tttttttttt ataattttga aatcttcata tttatatgtc catataaaac ataggaaaaa 6000 tcatcaattc tcaaaattat tggagaaaaa cacctacata tgcattatcg atcaactaca 6060 caaataggaa gtaaccattc gaagaaaatt aaagactaga gacatcaaaa gttgacaaac 6120 atgtgtacat gtgttaatgt aatcgtgtgc aagtgtcatg tcagttgagt taccagtgct 6180 aagtgttgct tccgttaatg ttataacaaa ttatcaatat atgtaagtac taattttaaa 6240 gaatattgct attaaatagt aattaagcta tctttggata catagaagac catgggaagt 6300 gaaaagtttc cttgataaaa ggagttgtgg ttttcaatat atatatatat tcaagaatga 6360 catgagaagt ctcttaatac aggcctccaa tgaaaaggaa atgaagaata attttccaat 6420 tactttcgag taaaaagtta tctatgttca atattttctt ttctttaaaa aaaagagaga 6480 acagaataga agaatgtata gatctgtctg ttttttgttc tttgataact agaatagtgt 6540 atgtttcagt ttcttggtaa gaggttagat gtgaagttct tataatacta taacaaaata 6600 tacttcatca tttgaggggt ggaggagatc ttacaaacaa actactgaac ccaattccct 6660 tttactttta gtaacatccc aattttggtg ttgagatatt gtgataaggt aaagttatat 6720 actttggcca aagtaattta caggatgaag ccttcttctc 
cctgattagt gactttctga 6780 ggtattaaca tcagagttct gaagattatc caaaaaaaca tcagagttct gaaatttatt 6840 agggccagag atttaatatt cttgaaatcc tcaattgcag ataattatga aatttaagat 6900 caaaatgaaa tatccacagg atcacttatt tatgactaaa atttgttttt ggggtgacat 6960 tcttccacat tattagaatg cacgttgtga tgacacttgc tttttcttga tggaaatgat 7020 gacatgaatg cgcagtgtca aatctacaat tgaaaggtac aaaaaggcat gtgcagattc 7080 ttccaacaac gggtcagttt ctgaagccaa tgctcaggta tcatttatca gctctaacaa 7140 ttgtttacat gtgcagattc ttccaacact ttgttataat cctttgtgtc ctactggttt 7200 ttggttttgg atactgatta gtttgtaatg tatgcactag ggctgaaaaa aggcatacag 7260 aattatgata ttatagaaca aaattaccaa ttaacagtat ttttctttct tttttaataa 7320 attacagtat agtttttcgt gaatttatgt gcgatcgagt gtttacactg aatttcaaaa 7380 tgtgcatgta cgttttgagg ctagtgtaga accacagaaa gacagtatat atggaactac 7440 cagcatataa caaaatcctt tttatgaaat tttatcgtcg atgttttaca ctaaattctc 7500 tcactattca ttaacagcgt aattaacaac atgctgttaa attatagaag gagttcaagc 7560 aatattccta acaatcattt attgccaata tttgtcaaat accctttgtg ataacctcat 7620 ttgtgtaaaa tcgattaaat acatacctat ttaattttgc ttctcagagt ggaggttttt 7680 tcctactgca ttgggagtca tgagtgtaaa cctgcattat agccagtttt gtgtacagaa 7740 acccttttcc ttcctctgtt gctgtggccc tattgtatca atttatttcc agtttgattc 7800 ggtattatat acatgtttcc aagaagtata agagagaaat gtacatcact gatattttct 7860 acttatattt tgagttctaa tctgaactcg aggatcttaa tctagttatt tataatgttt 7920 tattgccttt tgcttttgca tttcagtttt atcagcaaga agctgccaag ctgcgctcgc 7980 aaattggtaa tttgcagaat tcaaacaggt atgatcattt gtgatcttga tcaatttgtt 8040 agataaaatt tgtttttcct cttccaaact ccgtttaagc aaattaattt tcaggaatat 8100 gctgggtgaa tcacttagtg cattgagtgt gaaggaactt aagagcttgg agataaaact 8160 tgagaaagga attggtagaa ttcgttcgaa aaaggtcttt attctagtac tcaaatgatt 8220 ctctcttttt ttaagtcaaa tatcacttta attttccttg tattgccact aacaagtttt 8280 gttttgtctt gttttccttt tgttttttaa ttcctccctc aaacctgcca gaatgagctg 8340 ttgtttgctg aaattgagta tatgcagaag agggtaacaa cttttgtgct catattcacc 8400 atgacttctt ctatttgaga taaaaaaatc aagtttttgc caatttaatg 
atcctatggt 8460 gaacctcttc tattgtattt tcactccaaa aattttcttt gattcattga atgaaaatgc 8520 aaattgcagg agattgactt gcacaacaat aaccagcttc tccgagcaaa ggtctttcta 8580 cttatctatt tatcaatgcc ttgtgtgtgt ctgaacttgg atcttaatat cttagatcgt 8640 tggtgggttg tttttattta gtaaatatga cactacgtgg ggcttatgtt gatgttgcag 8700 attgcagaga atgaaagaaa gcgacagcac atgaatttga tgccgggagg tgtcaacttc 8760 gagatcatgc agtctcaacc atttgactct cggaactatt ctcaagttaa tggattgccg 8820 cctgccaatc attaccctca tgaagaccag ctcttcagtt agtgtaagta tttcctttgc 8880 aatgagctgt agtttttcat caattaatta ctgatgagca tataattaac tactttgatc 8940 tggatgggtt tcagtagcag cagcggctga atggttcgtg gtctgtaaaa atttattgga 9000 aggatataat aactgatgct gtgccttcta attctcataa tcatttgatc tttcaattag 9060 ttagatgatg atttacgcat tcttattgag atttttacca ttggatgata agagggaatt 9120 gcaatattta gctgttgtac taaaagtaga ctgctgttat cagcacccca tgctcactga 9180 agaactagaa gattacccaa cctagtttta cttcactgaa ccgtttgcat gcaagaactt 9240 aaagcgtaat ctgatttccc aagtgacaag tatatgtttc taactccttg acgaatctgc 9300 tgctgattcc tttgctggtt tatattattc ttatgactac aaacacaata cttttcaact 9360 agctagtgaa tgaataatca tttccttatg ttgcagttaa aaagcaccaa gtgcagcaac 9420 tcctcgcatt tccatattcc atggagagta cctactattt cactgagcgc aaaagctgca 9480 agtacgctaa aacaaaaatc tgaagtagca taactcaaat ttgtgccggt ggagagccta 9540 gtactcttcc tccatgtatt gcttttccag tcccagttaa gacataacaa atgtcagata 9600 aggatttctt ttctgcatgt ttcatgaagg cactaagatg ctgtgacagt acttgtgact 9660 aacttattat atattttgtc ttatatttct tatctttcat cttgtaatat ttcttcgcgt 9720 atctagtatt gcttttcatt caaacccttc cgtgacccag aatcaggacc actgccttag 9780 catgctgttc atcagcggta catgtaatag aggcctctat attttgctgc cagcttaata 9840 tacagtttac atctttcatg tgtgagttca gcacgagtaa ttaattttat ggttattttc 9900 tttgtaacag agcctcttga tgtctatttg taagcattgc gaggttttta aagattaaat 9960 taatacgtaa gctgaatgtc tcgcaaaagg tacaaattgc ttcagct 10007 14 1159 DNA Populus balsamifera subsp. 
trichocarpa CDS (99)..(815) 14 gaaaccccag agccaaagat ccttactttc tctccttaat aactactatc tctacatccc 60 ctactttggt ttatttcctc ccaagctagg cagcagct atg gca tac caa aat gaa 116 Met Ala Tyr Gln Asn Glu 1 5 ccc caa gag agc tct ccc ctg agg aag ctg ggg agg gga aag gtg gag 164 Pro Gln Glu Ser Ser Pro Leu Arg Lys Leu Gly Arg Gly Lys Val Glu 10 15 20 atc aag cgg atc gag aac acc acc aat cgc caa gtc act ttc tgc aaa 212 Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln Val Thr Phe Cys Lys 25 30 35 agg cgg aat ggt ttg ctc aag aaa gcc tat gaa tta tct gtt ctt tgc 260 Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr Glu Leu Ser Val Leu Cys 40 45 50 gat gct gag gtt gca ctc atc gtc ttc tcc agc cgt gga cgc ctt tat 308 Asp Ala Glu Val Ala Leu Ile Val Phe Ser Ser Arg Gly Arg Leu Tyr 55 60 65 70 gag tac tct aac aat agt gtc aaa tct aca att gaa agg tac aaa aag 356 Glu Tyr Ser Asn Asn Ser Val Lys Ser Thr Ile Glu Arg Tyr Lys Lys 75 80 85 gca tgt gca gat tct tcc aac aac ggg tca gtt tct gaa gcc aat gct 404 Ala Cys Ala Asp Ser Ser Asn Asn Gly Ser Val Ser Glu Ala Asn Ala 90 95 100 cag ttc tat cag caa gaa gct gcc aag ctg cgc tcg caa att ggt aat 452 Gln Phe Tyr Gln Gln Glu Ala Ala Lys Leu Arg Ser Gln Ile Gly Asn 105 110 115 ttg cag aat tca aac agg aat atg ctg ggt gaa tca ctt agt gca ttg 500 Leu Gln Asn Ser Asn Arg Asn Met Leu Gly Glu Ser Leu Ser Ala Leu 120 125 130 agt gtg aag gaa ctt aag agc ttg gag ata aaa ctt gag aaa gga att 548 Ser Val Lys Glu Leu Lys Ser Leu Glu Ile Lys Leu Glu Lys Gly Ile 135 140 145 150 ggt aga att cgt tcg aaa aag aat gag ctg ttg ttt gct gaa att gag 596 Gly Arg Ile Arg Ser Lys Lys Asn Glu Leu Leu Phe Ala Glu Ile Glu 155 160 165 tat atg cag aag agg gag att gac ttg cac aac aat aac cag ctt ctc 644 Tyr Met Gln Lys Arg Glu Ile Asp Leu His Asn Asn Asn Gln Leu Leu 170 175 180 cga gca aag att gca gag aat gaa aga aag cga cag cac atg aat ttg 692 Arg Ala Lys Ile Ala Glu Asn Glu Arg Lys Arg Gln His Met Asn Leu 185 190 195 atg ccg gga ggt gtc aac ttc gag atc atg cag tct caa cca ttt gac 740 Met Pro Gly Gly 
Val Asn Phe Glu Ile Met Gln Ser Gln Pro Phe Asp 200 205 210 tct cgg aac tat tct caa gtt aat gga ttg ccg cct gcc aat cat tac 788 Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu Pro Pro Ala Asn His Tyr 215 220 225 230 cct cat gaa gac cag ctc ttc agt tag tttaaaaagc accaagtgca 835 Pro His Glu Asp Gln Leu Phe Ser 235 gcaactcctc gcatttccat attccatgga gagtacctac tatttcactg agcgcaaaag 895 ctgcaagtac gctaaaacaa aaatctgaag tagcataact caaatttgtg ccggtggaga 955 gcctagtact cttcctccat gtattgcttt tccagtccca gttaagacat aacaaatgtc 1015 agataaggat ttcttttctg catgtttcat gaaggcacta agatgctgtg acagtacttg 1075 tgactaactt attatatatt ttgtcttata tttcttaaaa aaaaaaaaaa aaaaaaaaaa 1135 aaaaaaaaaa aaaaaaaaaa aaaa 1159 15 714 DNA Populus balsamifera subsp. trichocarpa CDS (1)..(714) 15 atg gca tac caa aat gaa ccc caa gag agc tct ccc ctg agg aag ctg 48 Met Ala Tyr Gln Asn Glu Pro Gln Glu Ser Ser Pro Leu Arg Lys Leu 1 5 10 15 ggg agg gga aag gtg gag atc aag cgg atc gag aac acc acc aat cgc 96 Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg 20 25 30 caa gtc act ttc tgc aaa agg cgg aat ggt ttg ctc aag aaa gcc tat 144 Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr 35 40 45 gaa tta tct gtt ctt tgc gat gct gag gtt gca ctc atc gtc ttc tcc 192 Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser 50 55 60 agc cgt gga cgc ctt tat gag tac tct aac aat agt gtc aaa tct aca 240 Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asn Ser Val Lys Ser Thr 65 70 75 80 att gaa agg tac aaa aag gca tgt gca gat tct tcc aac aac ggg tca 288 Ile Glu Arg Tyr Lys Lys Ala Cys Ala Asp Ser Ser Asn Asn Gly Ser 85 90 95 gtt tct gaa gcc aat gct cag ttc tat cag caa gaa gct gcc aag ctg 336 Val Ser Glu Ala Asn Ala Gln Phe Tyr Gln Gln Glu Ala Ala Lys Leu 100 105 110 cgc tcg caa att ggt aat ttg cag aat tca aac agg aat atg ctg ggt 384 Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg Asn Met Leu Gly 115 120 125 gaa tca ctt agt gca ttg agt gtg aag gaa ctt aag agc ttg gag ata 432 Glu Ser Leu Ser Ala Leu Ser Val Lys 
Glu Leu Lys Ser Leu Glu Ile 130 135 140 aaa ctt gag aaa gga att ggt aga att cgt tcg aaa aag aat gag ctg 480 Lys Leu Glu Lys Gly Ile Gly Arg Ile Arg Ser Lys Lys Asn Glu Leu 145 150 155 160 ttg ttt gct gaa att gag tat atg cag aag agg gag att gac ttg cac 528 Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Ile Asp Leu His 165 170 175 aac aat aac cag ctt ctc cga gca aag att gca gag aat gaa aga aag 576 Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ala Glu Asn Glu Arg Lys 180 185 190 cga cag cac atg aat ttg atg ccg gga ggt gtc aac ttc gag atc atg 624 Arg Gln His Met Asn Leu Met Pro Gly Gly Val Asn Phe Glu Ile Met 195 200 205 cag tct caa cca ttt gac tct cgg aac tat tct caa gtt aat gga ttg 672 Gln Ser Gln Pro Phe Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu 210 215 220 ccg cct gcc aat cat tac cct cat gaa gac cag ctc ttc agt 714 Pro Pro Ala Asn His Tyr Pro His Glu Asp Gln Leu Phe Ser 225 230 235 16 238 PRT Populus balsamifera subsp. trichocarpa 16 Met Ala Tyr Gln Asn Glu Pro Gln Glu Ser Ser Pro Leu Arg Lys Leu 1 5 10 15 Gly Arg Gly Lys Val Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg 20 25 30 Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr 35 40 45 Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser 50 55 60 Ser Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asn Ser Val Lys Ser Thr 65 70 75 80 Ile Glu Arg Tyr Lys Lys Ala Cys Ala Asp Ser Ser Asn Asn Gly Ser 85 90 95 Val Ser Glu Ala Asn Ala Gln Phe Tyr Gln Gln Glu Ala Ala Lys Leu 100 105 110 Arg Ser Gln Ile Gly Asn Leu Gln Asn Ser Asn Arg Asn Met Leu Gly 115 120 125 Glu Ser Leu Ser Ala Leu Ser Val Lys Glu Leu Lys Ser Leu Glu Ile 130 135 140 Lys Leu Glu Lys Gly Ile Gly Arg Ile Arg Ser Lys Lys Asn Glu Leu 145 150 155 160 Leu Phe Ala Glu Ile Glu Tyr Met Gln Lys Arg Glu Ile Asp Leu His 165 170 175 Asn Asn Asn Gln Leu Leu Arg Ala Lys Ile Ala Glu Asn Glu Arg Lys 180 185 190 Arg Gln His Met Asn Leu Met Pro Gly Gly Val Asn Phe Glu Ile Met 195 200 205 Gln Ser Gln Pro Phe Asp Ser Arg Asn Tyr Ser Gln Val Asn Gly Leu 210 215 220 
Pro Pro Ala Asn His Tyr Pro His Glu Asp Gln Leu Phe Ser 225 230 235
17 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 17 atgggtcgtg gaaagattga aatcaag 27
18 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 18 atttgtgaaa aagagctttt atattta 27
19 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 19 aggaaggcga agttcatggg atccaaa 27
20 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 20 tccacatcga caaagaagat ctacgat 27
21 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 21 gtcactttct gcaaaaggcg cagtggt 27
22 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 22 aactaactga agggccatct gatcttg 27
23 27 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 23 atggaatatc aaaatgaatc ccttgag 27
24 29 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide primer 24 attcatgctc tgtcgctttc tttcattct 29

We claim:
 1. An isolated nucleic acid molecule comprising at least 15 consecutive nucleotides of a nucleic acid sequence selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.
 2. An isolated nucleic acid molecule according to claim 1 wherein the nucleic acid molecule includes at least 25 consecutive nucleotides of the specified nucleic acid sequence.
 3. An isolated nucleic acid molecule according to claim 1 wherein the nucleic acid molecule includes at least 50 consecutive nucleotides of the specified nucleic acid sequence.
 4. A recombinant nucleic acid molecule comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 1.
 5. A recombinant nucleic acid molecule according to claim 4 wherein the nucleic acid molecule is arranged in antisense orientation relative to the promoter.
 6. A cell transformed with a recombinant nucleic acid molecule according to claim 4.
 7. A cell transformed with a recombinant nucleic acid molecule according to claim 5.
 8. A transgenic plant comprising a recombinant nucleic acid molecule according to claim 4.
 9. A transgenic plant comprising a recombinant nucleic acid molecule according to claim 5.
 10. A transgenic plant according to claim 8 wherein the activity of at least one endogenous gene in the plant is modified as a result of the presence of the recombinant nucleic acid molecule.
 11. A transgenic plant according to claim 10 wherein the plant is a Populus species and the affected endogenous gene is selected from the group consisting of PTD, PTLF, PTAG-1 and PTAG-2.
 12. A transgenic plant according to claim 10 wherein the plant has a modified phenotype relative to non-transgenic plants of the same species.
 13. A transgenic plant according to claim 12 wherein the modified phenotype is a modified fertility phenotype.
 14. A transgenic plant comprising a recombinant nucleic acid molecule, wherein the recombinant nucleic acid molecule comprises a promoter sequence operably linked to a first nucleic acid sequence, and wherein the promoter sequence is a promoter sequence from PTD, PTLF, PTAG-1 or PTAG-2.
 15. A transgenic plant according to claim 14 wherein the first nucleic acid sequence encodes a cytotoxic polypeptide.
 16. A transgenic plant according to claim 14 wherein the plant is a Populus species.
 17. An isolated nucleic acid molecule comprising a nucleotide sequence of at least 50 nucleotides in length wherein said molecule shares at least 75% sequence identity with a nucleic acid selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.
 18. An isolated nucleic acid molecule according to claim 17 wherein the molecule comprises a nucleotide sequence of at least 100 nucleotides in length and wherein said molecule shares at least 90% sequence identity with a nucleic acid selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.
 19. A recombinant nucleic acid molecule comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 17.
 20. A recombinant nucleic acid molecule according to claim 19 wherein the nucleic acid molecule is arranged in antisense orientation relative to the promoter.
 21. A cell transformed with a recombinant nucleic acid molecule according to claim 19.
 22. A transgenic plant comprising a recombinant nucleic acid molecule according to claim 19.
 23. A purified protein having an amino acid sequence selected from the group consisting of: (a) Seq. I.D. No. 4; (b) Seq. I.D. No. 8; (c) Seq. I.D. No. 12; (d) Seq. I.D. No. 16; and (e) sequences that differ from (a)-(d) by one or more conservative amino acid substitutions.
 24. An isolated nucleic acid molecule encoding a protein according to claim 23.
 25. An isolated nucleic acid molecule according to claim 24 wherein the nucleic acid molecule comprises a sequence selected from the group consisting of Seq. I.D. Nos. 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.
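The claims rely on three sequence-level notions: a run of consecutive nucleotides shared with a listed sequence, antisense orientation, and percent sequence identity. The following Python sketch illustrates each one. It is not part of the claims. The function names are hypothetical, and the identity calculation is a simple ungapped, position-by-position comparison chosen for illustration; the claims themselves do not state how identity is to be computed, and an alignment program may give different values. The two example strings are taken from the sequence listing: the primer of Seq. I.D. No. 21, and nucleotides 100 to 126 of Seq. I.D. No. 15.

```python
# Illustrative sketch only; helper names are hypothetical and the ungapped
# identity measure is an assumption, not a method defined by the claims.

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the sequence read on the opposite strand."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def shares_run(query: str, reference: str, k: int) -> bool:
    """True if some stretch of k consecutive nucleotides of query occurs in reference."""
    query, reference = query.lower(), reference.lower()
    return any(query[i:i + k] in reference for i in range(len(query) - k + 1))


def ungapped_identity(a: str, b: str) -> float:
    """Percent of positions that match between two equal-length sequences (no gaps)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.lower(), b.lower()))
    return 100.0 * matches / len(a)


# Primer of Seq. I.D. No. 21, as given in the sequence listing.
PRIMER_21 = "gtcactttctgcaaaaggcgcagtggt"
# Nucleotides 100-126 of Seq. I.D. No. 15, as given in the sequence listing.
SEQ15_WINDOW = "gtcactttctgcaaaaggcggaatggt"

if __name__ == "__main__":
    # The two strings share their first 20 nucleotides, so a 15-nucleotide run exists.
    print(shares_run(PRIMER_21, SEQ15_WINDOW, 15))
    # 25 of 27 positions match in this ungapped comparison.
    print(round(ungapped_identity(PRIMER_21, SEQ15_WINDOW), 1))
    # Sequence as it would read if inserted in the opposite (antisense) orientation.
    print(reverse_complement(SEQ15_WINDOW))
```

In this sketch, `shares_run` corresponds to the "consecutive nucleotides" language of claims 1 to 3, `reverse_complement` to the antisense orientation of claims 5 and 20, and `ungapped_identity` is only a rough stand-in for the percent identity thresholds of claims 17 and 18.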